mlx-whisper OOM error on files > 1GB #1366

@stickystyle

When I try to transcribe large files, mlx-whisper consistently crashes with kIOGPUCommandBufferCallbackErrorOutOfMemory. Do you have any advice on which flags to use to help with processing larger files? I've tried different models and specifying the language, with no difference in outcome.

(.venv) rparrish@oracle absrefined % python --version
Python 3.11.12
(.venv) rparrish@oracle absrefined % uv pip list|grep mlx
mlx                0.25.1
mlx-whisper        0.4.2
(.venv) rparrish@oracle absrefined % ls -l temp
total 6269632
-rw-r--r--@ 1 rparrish  staff  1202700876 May  4 17:05 1f7afc6c-246e-4d6c-943b-0223cc4f27e5_full.m4a
-rw-r--r--@ 1 rparrish  staff  1133814881 May  4 21:25 90ce63ba-2de8-4ab5-8fc4-5e367dad52df_full.m4a
-rw-r--r--@ 1 rparrish  staff   527000201 May  4 18:47 da1dfe53-2846-45a5-ba7f-61ef08221d5f_full.m4a
-rw-r--r--@ 1 rparrish  staff     7272845 May  4 18:52 da1dfe53-2846-45a5-ba7f-61ef08221d5f_full_audio.jsonl
-rw-r--r--@ 1 rparrish  staff   316813460 May  4 17:26 fba0c82e-22a4-443d-9fa4-6b7da7548f14_full.m4a
-rw-r--r--@ 1 rparrish  staff     4435343 May  4 17:30 fba0c82e-22a4-443d-9fa4-6b7da7548f14_full_audio.jsonl
(.venv) rparrish@oracle absrefined % mlx_whisper temp/1f7afc6c-246e-4d6c-943b-0223cc4f27e5_full.m4a
Args: {'audio': ['temp/1f7afc6c-246e-4d6c-943b-0223cc4f27e5_full.m4a'], 'model': 'mlx-community/whisper-tiny', 'output_name': None, 'output_dir': '.', 'output_format': 'txt', 'verbose': True, 'task': 'transcribe', 'language': None, 'temperature': 0, 'best_of': 5, 'patience': None, 'length_penalty': None, 'suppress_tokens': '-1', 'initial_prompt': None, 'condition_on_previous_text': True, 'fp16': True, 'compression_ratio_threshold': 2.4, 'logprob_threshold': -1.0, 'no_speech_threshold': 0.6, 'word_timestamps': False, 'prepend_punctuations': '"\'“¿([{-', 'append_punctuations': '"\'.。,,!!??::”)]}、', 'highlight_words': False, 'max_line_width': None, 'max_line_count': None, 'max_words_per_line': None, 'hallucination_silence_threshold': None, 'clip_timestamps': '0'}
Fetching 4 files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 93206.76it/s]
Detecting language using up to the first 30 seconds. Use the `language` decoding option to specify the language
libc++abi: terminating due to uncaught exception of type std::runtime_error: [METAL] Command buffer execution failed: Insufficient Memory (00000008:kIOGPUCommandBufferCallbackErrorOutOfMemory)
zsh: abort      mlx_whisper temp/1f7afc6c-246e-4d6c-943b-0223cc4f27e5_full.m4a
/Users/rparrish/.local/share/uv/python/cpython-3.11.12-macos-aarch64-none/lib/python3.11/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
^C%

The 527 MB file shown in the directory listing transcribes without issue, with only a moderate memory spike before processing.
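
Since the smaller file goes through fine, one possible workaround (untested against the failing files, and not something confirmed by the maintainers) would be to split the audio into fixed-length pieces with ffmpeg and transcribe each piece separately, so that no single run has to hold the whole recording at once. The sketch below assumes `ffmpeg` is on the PATH and uses `mlx_whisper.transcribe` with the `mlx-community/whisper-tiny` model from the log above. The helper names, the `chunks` directory, and the 1800-second chunk length are placeholders chosen for illustration, not values known to avoid the crash.

```python
import subprocess
from pathlib import Path


def build_split_command(src, out_dir, segment_seconds=1800):
    """Build an ffmpeg command that cuts `src` into 16 kHz mono WAV pieces."""
    src = Path(src)
    # Hypothetical naming scheme: <stem>_0000.wav, <stem>_0001.wav, ...
    pattern = Path(out_dir) / f"{src.stem}_%04d.wav"
    return [
        "ffmpeg", "-hide_banner", "-loglevel", "error",
        "-i", str(src),
        "-vn",                        # ignore cover art / video streams
        "-ac", "1", "-ar", "16000",   # mono, 16 kHz
        "-c:a", "pcm_s16le",
        "-f", "segment",
        "-segment_time", str(segment_seconds),
        "-reset_timestamps", "1",     # each piece starts at t=0
        str(pattern),
    ]


def shift_segments(segments, offset_seconds):
    """Return copies of the segments with start/end moved by `offset_seconds`."""
    return [
        {**seg,
         "start": seg["start"] + offset_seconds,
         "end": seg["end"] + offset_seconds}
        for seg in segments
    ]


def transcribe_in_chunks(src, out_dir="chunks", segment_seconds=1800,
                         model="mlx-community/whisper-tiny"):
    """Split `src`, transcribe each piece, and merge the segment lists."""
    import mlx_whisper  # imported lazily so the helpers above work without it

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_split_command(src, out, segment_seconds), check=True)

    merged = []
    chunks = sorted(out.glob(f"{Path(src).stem}_*.wav"))
    for index, chunk in enumerate(chunks):
        result = mlx_whisper.transcribe(str(chunk), path_or_hf_repo=model)
        # Offsets are approximate: they assume every piece is exactly
        # `segment_seconds` long, which ffmpeg does not strictly guarantee.
        merged.extend(shift_segments(result["segments"], index * segment_seconds))
    return merged
```

Calling `transcribe_in_chunks("temp/1f7afc6c-246e-4d6c-943b-0223cc4f27e5_full.m4a")` would return one merged list of segments with timestamps relative to the original file. Words that straddle a chunk boundary may be cut or duplicated, so this trades some accuracy at the seams for a lower peak memory footprint.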
