Grogu22/VoiceOps

VoiceOps

VoiceOps is an asynchronous audio processing and transcription framework built with Django REST Framework, Celery, Redis, and PostgreSQL.

It is designed for API-first media workflows in which transcription takes too long to complete within a single request cycle. A client registers an audio asset, queues transcription, polls the task state, and then consumes the resulting transcript or summary. This makes it suitable for speech data operations, transcript generation, summarization pipelines, and preparing training data for audio ML workflows. It already integrates with Google Cloud Storage (GCS) as the media ingestion layer.
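The polling step of that client flow can be sketched as follows. This is a minimal illustration, not the project's client: `poll_until_done` and `fetch_state` are hypothetical names, and the sketch assumes the API reports Celery's standard state strings (`PENDING`, `STARTED`, `SUCCESS`, `FAILURE`), which this README does not confirm. In real use, `fetch_state` would wrap an HTTP call to the task-state endpoint.

```python
import time
from typing import Callable

# Celery's built-in terminal states. Whether VoiceOps exposes these names
# verbatim through its API is an assumption.
TERMINAL_STATES = {"SUCCESS", "FAILURE", "REVOKED"}


def poll_until_done(
    fetch_state: Callable[[], str],
    interval: float = 1.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_state repeatedly until it returns a terminal state.

    fetch_state is a hypothetical callable that returns the current task
    state as a string, for example by querying a task-status endpoint.
    """
    for attempt in range(max_attempts):
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        # Do not sleep after the final attempt; fail straight away instead.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"task still not finished after {max_attempts} polls")


# Simulated task that finishes on the third poll (no network involved).
simulated_states = iter(["PENDING", "STARTED", "SUCCESS"])
final_state = poll_until_done(lambda: next(simulated_states), sleep=lambda _: None)
print(final_state)
```

Injecting `fetch_state` and `sleep` keeps the loop independent of any particular HTTP library and makes it easy to exercise without a running server.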

The current pipeline covers:

  • asset registration
  • queued transcription
  • queued transcript summarization
  • task-state polling
  • queue separation for transcription and summarization workloads
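
Queue separation of this kind is commonly expressed in Celery through the `task_routes` setting. The fragment below is a sketch under stated assumptions: the task paths and queue names are hypothetical, and it assumes the Celery app loads Django settings with `namespace="CELERY"`, the pattern described in Celery's Django guide. The `queue_for` helper is illustrative only and is not part of Celery.

```python
# Django settings fragment (sketch). Task paths and queue names below are
# hypothetical; substitute the project's real task names.
CELERY_TASK_ROUTES = {
    "assets.tasks.transcribe_asset": {"queue": "transcription"},
    "assets.tasks.summarize_transcript": {"queue": "summarization"},
}

# "celery" is Celery's default queue name for tasks without a route.
CELERY_TASK_DEFAULT_QUEUE = "celery"


def queue_for(task_name: str) -> str:
    """Illustrative helper: look up the queue for an exact task name.

    It mirrors only the exact-name lookup in the route map above and
    falls back to the default queue for anything not listed.
    """
    route = CELERY_TASK_ROUTES.get(task_name, {})
    return route.get("queue", CELERY_TASK_DEFAULT_QUEUE)


print(queue_for("assets.tasks.transcribe_asset"))
print(queue_for("assets.tasks.some_other_task"))
```

With routes like these, each workload can be served by its own worker pool, for example by starting one worker with `-Q transcription` and another with `-Q summarization`, so long-running transcription jobs do not delay summarization.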

Planned future work:

  • embedding pipelines for transcripts and summaries
  • semantic search and retrieval over transcript corpora
  • dataset curation workflows for audio and speech ML
  • richer orchestration for multi-stage post-processing
