Code to train a custom time-domain autoencoder to dereverb audio
Updated Nov 30, 2023 - Python
Automated audio/video ML pipeline for detecting and transcribing jazz solos from live recordings. Runs nightly against Smalls Jazz Club archives: uses CLAP (instrument detection), Demucs (source separation), CLIP (performer identification), and basic-pitch (MIDI transcription). Results served via REST API.
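The detection/separation/identification/transcription chain above is a classic staged pipeline. A minimal sketch of how such stages can be composed, with stand-in functions in place of the real CLAP, CLIP, and basic-pitch calls (all function names and the `Clip` record here are hypothetical, not the project's actual API):

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class Clip:
    """One candidate solo segment flowing through the pipeline."""
    path: str
    instrument: Optional[str] = None
    performer: Optional[str] = None
    midi: List[int] = field(default_factory=list)

def detect_instrument(clip: Clip) -> Clip:
    # Stand-in for a CLAP audio-tagging call.
    clip.instrument = "tenor sax"
    return clip

def identify_performer(clip: Clip) -> Clip:
    # Stand-in for CLIP-based frame matching against a performer gallery.
    clip.performer = "unknown"
    return clip

def transcribe(clip: Clip) -> Clip:
    # Stand-in for basic-pitch MIDI transcription (MIDI note numbers).
    clip.midi = [60, 62, 64]
    return clip

PIPELINE: List[Callable[[Clip], Clip]] = [
    detect_instrument, identify_performer, transcribe,
]

def run(clip: Clip) -> Clip:
    """Push a clip through every stage in order."""
    for stage in PIPELINE:
        clip = stage(clip)
    return clip
```

Keeping each model behind a plain `Clip -> Clip` function makes it easy to swap a stage out or re-run only part of the nightly job.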
ML-based speech emotion recognition system that classifies emotions from extracted audio features, with a simple interface for testing.
Key Features:
- Simple VAE architecture with encoder/decoder
- Synthetic music data generation for training
- Interactive training with progress tracking
- Music generation from latent-space sampling
- Audio conversion and playback
- Downloadable audio files
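The core of any VAE, including a music one like this, is the reparameterization trick: sample `z = mu + sigma * eps` so the sampling step stays differentiable. A toy NumPy sketch (the encoder/decoder here are deliberately fake placeholders, not the project's trained networks; latent size 8 and frame size 256 are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 8   # assumed toy latent size
FRAME_LEN = 256  # assumed toy audio-frame length

def encode(x: np.ndarray):
    """Toy 'encoder': map a frame to a latent mean and log-variance.
    A real VAE would use learned neural-network layers here."""
    mu = x.mean() * np.ones(LATENT_DIM)
    logvar = np.zeros(LATENT_DIM)
    return mu, logvar

def reparameterize(mu: np.ndarray, logvar: np.ndarray) -> np.ndarray:
    """z = mu + sigma * eps -- sampling stays differentiable in mu/logvar."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

DECODER_W = rng.standard_normal((LATENT_DIM, FRAME_LEN)) * 0.1

def decode(z: np.ndarray) -> np.ndarray:
    """Toy 'decoder': expand a latent vector back to a frame in [-1, 1]."""
    return np.tanh(z @ DECODER_W)
```

Generation then amounts to drawing `z` from a standard normal and calling `decode(z)`, which is the "music generation from latent-space sampling" feature above.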
AI-generated audio summarisation pipeline — Whisper transcription, LLM key-insight extraction, and structured spoken summaries with TTS playback and Streamlit interface.
Audio file processing pipeline with GPT-4-powered error diagnosis — detects codec issues, sample rate mismatches, and corruption artefacts with automated remediation suggestions.
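Of the checks listed, sample-rate mismatch detection is the most mechanical: read the declared rate from the container header and compare it to the pipeline's expectation. A minimal sketch using only the standard-library `wave` module (the 44.1 kHz target and function names are assumptions for illustration, not the project's code):

```python
import io
import wave

EXPECTED_RATE = 44100  # assumed target rate for the pipeline

def check_sample_rate(wav_bytes: bytes, expected: int = EXPECTED_RATE):
    """Return (ok, message) describing any sample-rate mismatch."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
        rate = wf.getframerate()
    if rate != expected:
        return False, f"sample rate mismatch: got {rate} Hz, expected {expected} Hz"
    return True, "ok"

def make_silent_wav(rate: int, seconds: float = 0.1) -> bytes:
    """Generate an in-memory mono 16-bit WAV of silence for testing."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 16-bit samples
        wf.setframerate(rate)
        wf.writeframes(b"\x00\x00" * int(rate * seconds))
    return buf.getvalue()
```

A diagnosis message like the one returned here is the kind of structured finding that could then be handed to an LLM for remediation suggestions.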
Neural TTS and voice-cloning application using XTTS/VITS. Supports 3–30 s reference audio for speaker adaptation, real-time pitch/speed control, and WAV/MP3 export.
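The 3–30 s reference-audio window is a hard constraint worth validating before handing a clip to the cloning model. A small sketch of that gate (the function name and bounds-as-parameters design are illustrative, not the app's actual interface):

```python
def reference_duration_ok(num_samples: int, sample_rate: int,
                          min_s: float = 3.0, max_s: float = 30.0) -> bool:
    """Check a reference clip against the 3-30 s speaker-adaptation window.

    Duration is derived from the raw sample count rather than file
    metadata, so a truncated file can't sneak past the check.
    """
    duration = num_samples / sample_rate
    return min_s <= duration <= max_s
```

Clips outside the window would be rejected up front with a clear error rather than producing a silently degraded voice clone.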
Music harmony AI — chord progression analysis with Roman numeral labelling, voice leading checker, style-conditioned progression generation (Baroque/Jazz/Pop), and MIDI export via music21.
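Roman-numeral labelling maps each chord's scale degree (and quality) to a numeral relative to the key; in the project this is handled by music21, but the core idea fits in a toy lookup. A sketch for diatonic triads in C major (this stand-in is illustrative only and ignores inversions, borrowed chords, and other keys):

```python
# Toy Roman-numeral labeller for diatonic triads in C major.
# The real project uses music21; this is an illustrative stand-in.
C_MAJOR_ROOTS = ["C", "D", "E", "F", "G", "A", "B"]
NUMERALS = ["I", "ii", "iii", "IV", "V", "vi", "viio"]  # case encodes quality

def label_triad(root: str) -> str:
    """Map a diatonic triad root in C major to its Roman numeral."""
    return NUMERALS[C_MAJOR_ROOTS.index(root)]

def label_progression(roots):
    """Label a whole progression, e.g. C-F-G-C -> I-IV-V-I."""
    return [label_triad(r) for r in roots]
```

music21's `roman` module generalizes this to any key and chord quality, which is what makes the style-conditioned generation and MIDI export practical.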