feat: add audio/video file type support for structured extraction by devin-ai-integration[bot] · Pull Request #3 · nexla-opensource/nextract

devin-ai-integration · 2026-03-18T22:59:17Z

Summary

Adds audio and video file handling to nextract's extraction pipeline. Audio (.mp3, .wav, .m4a, .ogg, .flac, .aac, .wma) and video (.mp4, .webm, .mov, .avi, .mkv, .wmv) files are now recognized and attached as BinaryContent with their native MIME types, following the same pattern as image file handling.

Changes:

mimetypes_map.py: New _AUDIO_EXTS/_VIDEO_EXTS sets and is_audio()/is_video() helpers (mirrors existing _IMAGE_EXTS/is_image())
files.py: Audio/video branches in _prepare_single_file(), placed between PDF and office-binary handling
README.md: Replaces "Not supported: Audio/Video" with documented audio/video support in both the scope and file-type-handling sections
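As a rough illustration of the mimetypes_map.py change, the helpers could look like the sketch below. The extension sets are the ones listed in the summary; the function signatures (taking a `Path`) and the case-insensitive suffix match are assumptions, since the diff itself is not reproduced here.

```python
from pathlib import Path

# Extension sets as listed in the PR summary.
_AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".ogg", ".flac", ".aac", ".wma"}
_VIDEO_EXTS = {".mp4", ".webm", ".mov", ".avi", ".mkv", ".wmv"}


def is_audio(path: Path) -> bool:
    """Return True if the file suffix is a known audio extension (assumed case-insensitive)."""
    return path.suffix.lower() in _AUDIO_EXTS


def is_video(path: Path) -> bool:
    """Return True if the file suffix is a known video extension (assumed case-insensitive)."""
    return path.suffix.lower() in _VIDEO_EXTS
```

Keeping audio and video in separate sets, rather than one combined media set, matches the existing _IMAGE_EXTS/is_image() pattern that the description says this change mirrors.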

This is Phase 2 of the audio/video structured extraction plan. Companion PR in veda-ai adds these extensions to SUPPORTED_FILE_TYPES (Phase 1).

Review & Testing Checklist for Human

No unit tests added — the new is_audio()/is_video() functions and the audio/video branches in _prepare_single_file() are untested. Consider whether tests should be required before merge.
MIME type guessing for .m4a/.wma — these rely on mimetypes.guess_type() from stdlib (no custom entries in _CUSTOM). Verify guess_mime(Path("test.m4a")) returns audio/mp4 or similar on your target environment, not application/octet-stream.
Memory usage for large video files — path.read_bytes() loads the entire file into memory. Acceptable for the existing image pattern, but video files can be 100MB+. Verify this aligns with expected file size limits upstream.
Test plan: Create a small .mp3 and .mp4 file, call _prepare_single_file(Path("test.mp3")) and verify the returned PreparedPart has binary set with the correct media_type. Also verify an end-to-end extract() call with a model that supports audio/video input (e.g. Gemini) actually produces structured output.
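One way to act on the .m4a/.wma item is to check what the stdlib guesses and supply an explicit fallback when it returns nothing. The sketch below is not the repository's guess_mime(); `guess_mime_with_fallback` and `_FALLBACK_MIME` are hypothetical names, and the fallback values are commonly used MIME types chosen for illustration rather than taken from the PR.

```python
import mimetypes
from pathlib import Path

# Hypothetical fallback table (not from the PR); values are commonly used MIME types.
_FALLBACK_MIME = {
    ".m4a": "audio/mp4",
    ".wma": "audio/x-ms-wma",
    ".mkv": "video/x-matroska",
}


def guess_mime_with_fallback(path: Path) -> str:
    """Prefer the stdlib guess; fall back to the table, then to octet-stream."""
    guessed, _encoding = mimetypes.guess_type(path.name)
    if guessed is not None:
        return guessed
    return _FALLBACK_MIME.get(path.suffix.lower(), "application/octet-stream")
```

The stdlib result can differ between platforms because mimetypes also reads system MIME tables, so running this on the target environment is more informative than relying on a local result.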

Notes

Audio/video are pure binary passthrough — no transcription or frame extraction. This relies on the downstream LLM supporting native audio/video input tokens.
The extension sets match what was added to veda-ai's SUPPORTED_FILE_TYPES in the Phase 1 PR.
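The passthrough behavior, together with the memory concern raised in the checklist, can be sketched as follows. `BinaryPart`, `prepare_media`, and the 50 MiB default limit are hypothetical stand-ins for illustration; the real code uses BinaryContent and PreparedPart inside _prepare_single_file(), and the PR does not say whether it enforces any size limit.

```python
import mimetypes
from dataclasses import dataclass
from pathlib import Path


@dataclass
class BinaryPart:
    """Hypothetical stand-in for the binary attachment described in the PR."""
    data: bytes
    media_type: str


def prepare_media(path: Path, max_bytes: int = 50 * 1024 * 1024) -> BinaryPart:
    """Attach a media file as raw bytes, refusing files above an assumed size limit."""
    # Check the size on disk first so an oversized file is never read into memory.
    size = path.stat().st_size
    if size > max_bytes:
        raise ValueError(f"{path.name} is {size} bytes; limit is {max_bytes}")
    media_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    # Pure passthrough: no transcription or frame extraction happens here.
    return BinaryPart(data=path.read_bytes(), media_type=media_type)
```

Checking `stat().st_size` before `read_bytes()` is one inexpensive guard; whether a limit belongs here or upstream is a design decision the PR leaves open.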

Link to Devin session: https://app.devin.ai/sessions/66c886f2998044ce8279af8a4c5d8a51
Requested by: @mihir-nexla

Add audio extensions (mp3, wav, m4a, ogg, flac, aac, wma) and video extensions (mp4, webm, mov, avi, mkv, wmv) support:

- Add _AUDIO_EXTS and _VIDEO_EXTS sets to mimetypes_map.py
- Add is_audio() and is_video() helper functions
- Add audio/video handling in _prepare_single_file() as BinaryContent with native audio/*/video/* MIME types
- Update README.md to document audio/video support and remove from 'not supported' section

Co-Authored-By: mihir.pamnani <mihir.pamnani@nexla.com>

devin-ai-integration · 2026-03-18T22:59:20Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

abhijit914 approved these changes Mar 23, 2026

saksham-nexla merged commit 970488d into main Mar 23, 2026
3 checks passed

saksham-nexla merged 1 commit into main from devin/1773874530-nextract-audio-video-support
