Skip to content

Failing Flash Attention 2 tests #29942

@ylacombe

Description

@ylacombe

System Info

  • transformers version: 4.40.0.dev0
  • Platform: Linux-5.4.0-166-generic-x86_64-with-glibc2.29
  • Python version: 3.8.10
  • Huggingface_hub version: 0.20.2
  • Safetensors version: 0.4.2
  • Accelerate version: 0.28.0
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.1.0+cu121 (True)
  • Tensorflow version (GPU?): 2.13.1 (True)
  • Flax version (CPU?/GPU?/TPU?): 0.7.0 (cpu)
  • Jax version: 0.4.13
  • JaxLib version: 0.4.13
  • Using GPU in script?: Yes - A100 80GB

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

RUN_SLOW=1 pytest tests/models -k "flash_attn"

Results:

FAILED tests/models/bark/test_modeling_bark.py::BarkSemanticModelTest::test_flash_attn_2_from_config - ValueError: Unrecognized configuration class <class 'transformers.models.bark.configuration_bark.BarkSemanticConfig'> for this kind of AutoModel: AutoModelForCausalLM.
FAILED tests/models/bark/test_modeling_bark.py::BarkCoarseModelTest::test_flash_attn_2_from_config - ValueError: Unrecognized configuration class <class 'transformers.models.bark.configuration_bark.BarkCoarseConfig'> for this kind of AutoModel: AutoModelForCausalLM.
FAILED tests/models/bark/test_modeling_bark.py::BarkCoarseModelTest::test_flash_attn_2_generate_padding_right - AssertionError: False is not true
FAILED tests/models/gemma/test_modeling_gemma.py::GemmaModelTest::test_flash_attn_2_generate_padding_right - AssertionError: ValueError not raised
FAILED tests/models/gpt2/test_modeling_gpt2.py::GPT2ModelLanguageGenerationTest::test_flash_attn_2_generate_padding_left - AssertionError: Lists differ: ['<|e[102 chars]y of Bali, and who was a member of the Muslim [101 chars]rry"] != ['<|e[102 chars]y of Kolkata, was a member of the Kolkata', "H[85 chars]rry"]
FAILED tests/models/gpt_bigcode/test_modeling_gpt_bigcode.py::GPTBigCodeModelTest::test_flash_attn_2_generate_padding_right - AssertionError: False is not true
FAILED tests/models/gpt_neo/test_modeling_gpt_neo.py::GPTNeoModelTest::test_flash_attn_2_generate_padding_right - AssertionError: False is not true
FAILED tests/models/gpt_neox/test_modeling_gpt_neox.py::GPTNeoXModelTest::test_flash_attn_2_generate_padding_right - AssertionError: False is not true
FAILED tests/models/stablelm/test_modeling_stablelm.py::StableLmModelTest::test_flash_attn_2_generate_padding_right - AssertionError: False is not true
FAILED tests/models/whisper/test_modeling_whisper.py::WhisperModelTest::test_flash_attn_2_inference_padding_right - AssertionError: assert False
FAILED tests/models/whisper/test_modeling_whisper.py::WhisperStandaloneDecoderModelTest::test_flash_attn_2_inference - AssertionError: assert False
FAILED tests/models/whisper/test_modeling_whisper.py::WhisperStandaloneDecoderModelTest::test_flash_attn_2_inference_padding_right - AssertionError: assert False

Expected behavior

Flash Attention tests are all expected to pass.
I'll look into Bark, the rest of the models failing are:

  • Whisper (cc @sanchit-gandhi)
  • StableLM
  • GPTNeo
  • GPTNeoX
  • GPTBigCode
  • GPT2
  • Gemma

cc @ArthurZucker @amyeroberts

Metadata

Metadata

Assignees

No one assigned

    Labels

    AudioGood Second IssueIssues that are more difficult to do than "Good First" issues - give it a try if you want!

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions