feat: add audio file support (MP3, WAV, FLAC, OGG, AAC, M4A) #20

Open
hai-pilgrim wants to merge 1 commit into marksverdhei:main from hai-pilgrim:feat/audio-support
Conversation

@hai-pilgrim

Summary

Closes #19

Audio files (.mp3, .wav, .flac, .ogg, .aac, .m4a) were previously skipped with "binary file, skipping". This PR adds full audio support:

  • is_audio() detection via magic bytes (ID3 header, MPEG sync, RIFF/WAVE, fLaC, OggS, ADTS, M4A ftyp box) with extension fallback
  • Audio metadata extraction via ffprobe: title, artist, album, genre, date, duration, bitrate, codec, sample rate
  • LLM context: metadata is passed as text so the model can suggest a name from embedded tags (e.g. beethoven-symphony-5-mvmt1.flac). No audio bytes are sent to the LLM.
  • 13 new bats tests covering magic-byte detection for all 6 formats, metadata collection, and end-to-end --dry-run naming
  • README updated with audio formats and ffprobe as optional dependency
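
For readers unfamiliar with magic-byte detection, the approach in the first bullet can be sketched roughly as below. This is not the PR's `is_audio()`: the function name `detect_audio_format`, the exact sync patterns used for MPEG and ADTS, and the print-and-return convention are all assumptions made for illustration.

```shell
# Hypothetical sketch, not the PR's is_audio().
# Prints a format guess and returns 0 for audio; returns 1 otherwise.
detect_audio_format() {
  local file="$1" hex name ext
  # First 12 bytes as one lowercase hex string (two hex chars per byte).
  hex=$(head -c 12 "$file" 2>/dev/null | od -An -v -tx1 | tr -d ' \n')
  case $hex in
    494433*)                   echo mp3;  return 0 ;;  # "ID3" tag header
    664c6143*)                 echo flac; return 0 ;;  # "fLaC"
    4f676753*)                 echo ogg;  return 0 ;;  # "OggS"
    52494646????????57415645*) echo wav;  return 0 ;;  # "RIFF" <size> "WAVE"
    # "ftyp" box at offset 4; this alone does not tell M4A from other MP4 brands.
    ????????66747970*)         echo m4a;  return 0 ;;
    fff[0189]*)                echo aac;  return 0 ;;  # ADTS: 12 sync bits, layer 00
    ff[ef]?*)                  echo mp3;  return 0 ;;  # MPEG audio: 11 sync bits
  esac
  # Extension fallback when no signature matched.
  name=${file##*/}
  case $name in
    *.*) ext=$(printf '%s' "${name##*.}" | tr '[:upper:]' '[:lower:]') ;;
    *)   ext= ;;
  esac
  case $ext in
    mp3|wav|flac|ogg|aac|m4a) echo "$ext"; return 0 ;;
  esac
  return 1
}
```

The ADTS pattern is checked before the broader MPEG sync pattern because every ADTS header also satisfies the 11-bit MPEG sync test.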

ffprobe (from ffmpeg) is optional. If it is unavailable, the file is still processed with basic metadata and whatever context the LLM can infer.
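
The optional-dependency behaviour can be illustrated with a short sketch. The function name `audio_metadata`, the `FFPROBE_BIN` override, and the `key=value` output layout are assumptions and are not taken from the PR; only the `ffprobe` options themselves are standard.

```shell
# Hypothetical sketch; audio_metadata and FFPROBE_BIN are illustrative names.
audio_metadata() {
  local file="$1" probe="${FFPROBE_BIN:-ffprobe}"
  # Basic metadata that is always available, even without ffprobe.
  echo "filename=${file##*/}"
  echo "size_bytes=$(( $(wc -c < "$file") ))"
  # Richer metadata only when ffprobe is on PATH; probe errors are ignored.
  if command -v "$probe" >/dev/null 2>&1; then
    "$probe" -v error \
      -select_streams a:0 \
      -show_entries stream=codec_name,sample_rate:format=duration,bit_rate:format_tags=title,artist,album,genre,date \
      -of default=noprint_wrappers=1 \
      "$file" 2>/dev/null || true
  fi
}
```

With `ffprobe` installed, the extra lines look like `codec_name=...`, `duration=...`, and `TAG:title=...`; without it, only the filename and size lines are printed, which matches the fallback described above.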

Test plan

  • bats test.sh — all 13 new audio tests pass
  • Pre-existing tests unaffected
  • Manual test with a tagged MP3: hat --dry-run --force some-song.mp3
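
For trying the WAV path without a real recording, a tiny fixture can be written byte by byte. This is a sketch under assumptions: the helper name `make_wav_fixture` is hypothetical, and the PR does not say how its own test fixtures are produced.

```shell
# Hypothetical helper: writes a 48-byte WAV (8-bit mono PCM, 8000 Hz,
# four samples of silence) using only printf octal escapes.
make_wav_fixture() {
  local out="$1"
  {
    printf 'RIFF\050\000\000\000WAVE'        # RIFF size = 40 (36 + 4 data bytes)
    printf 'fmt \020\000\000\000'            # fmt chunk, 16 bytes long
    printf '\001\000\001\000'                # PCM format, 1 channel
    printf '\100\037\000\000'                # sample rate 8000
    printf '\100\037\000\000'                # byte rate 8000
    printf '\001\000\010\000'                # block align 1, 8 bits per sample
    printf 'data\004\000\000\000'            # data chunk, 4 bytes
    printf '\200\200\200\200'                # unsigned 8-bit silence
  } > "$out"
}
```

After `make_wav_fixture /tmp/silence.wav`, the manual check from the test plan would become `hat --dry-run --force /tmp/silence.wav`.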

Audio files were previously skipped as unreadable binaries. This adds:
- `is_audio()` detection via magic bytes (ID3, RIFF/WAVE, fLaC, OggS,
  ADTS, M4A ftyp box) with extension fallback
- Audio metadata extraction via `ffprobe`: title, artist, album, genre,
  year, duration, bitrate, codec, sample rate
- `build_user_content()` audio mode passes rich metadata as text context
  so the LLM can suggest a name based on embedded tags
- 13 new bats tests covering detection, metadata, and end-to-end naming
- README updated with audio formats and ffprobe optional dependency

Closes marksverdhei#19
Development

Successfully merging this pull request may close these issues.

Audio support

2 participants