RVC Data Prep: An Open-Source RVC Data Preparation Tool

A Dubverse Black initiative

Description

RVC Data Prep is a tool for transforming audio/video content into isolated vocals. If a video contains multiple speakers, it generates a separate file for each one. The core functionality uses Facebook's Demucs to isolate vocals and Pyannote embeddings to identify and differentiate speakers, which works best when speakers do not talk over each other.
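To give a feel for how embeddings can separate speakers, here is a toy sketch that groups segments by cosine similarity. It is not how Pyannote's diarization pipeline works internally, and it is not taken from utils.py; the function names, the greedy strategy and the 0.8 threshold are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_speakers(embeddings, threshold=0.8):
    """Greedily label each segment embedding with a speaker index.

    Hypothetical helper: a segment joins the most similar known speaker
    if the similarity reaches the threshold, otherwise it starts a new one.
    """
    references = []  # first embedding seen for each speaker
    labels = []
    for emb in embeddings:
        best, best_sim = None, threshold
        for idx, ref in enumerate(references):
            sim = cosine_similarity(emb, ref)
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            references.append(emb)
            best = len(references) - 1
        labels.append(best)
    return labels


# Two clearly different "voices" in a made-up 2-D embedding space.
segments = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(assign_speakers(segments))
```

Real speaker embeddings have hundreds of dimensions and are usually clustered with more robust methods, so treat this only as an intuition aid.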

Features

Isolate vocals from YouTube videos
Distinguish multiple speakers and provide separate files
Trim silences greater than 300ms from the audio
(Beta) Separate multi-singer Acapellas
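The silence-trimming feature can be pictured with the minimal sketch below. It assumes that "trimming" means dropping any quiet run longer than 300 ms while keeping shorter pauses, and that silence is detected with a simple amplitude threshold; the function name and the threshold value are hypothetical, and the tool itself may do this differently.

```python
def trim_long_silences(samples, sample_rate, max_silence_ms=300, threshold=0.01):
    """Drop runs of quiet samples that last longer than max_silence_ms.

    samples: sequence of floats in [-1.0, 1.0] (mono audio).
    Quiet runs at or below the limit are kept so natural pauses survive.
    """
    max_run = int(sample_rate * max_silence_ms / 1000)
    out = []
    run = []  # quiet samples collected since the last loud sample
    for s in samples:
        if abs(s) < threshold:
            run.append(s)
        else:
            if len(run) <= max_run:
                out.extend(run)  # short pause: keep it
            run = []
            out.append(s)
    if len(run) <= max_run:
        out.extend(run)  # trailing short pause
    return out


# 1 kHz toy signal: a 400 ms gap is removed, a 100 ms gap is kept.
audio = [0.5] * 10 + [0.0] * 400 + [0.5] * 10 + [0.0] * 100 + [0.5] * 5
print(len(audio), "->", len(trim_long_silences(audio, sample_rate=1000)))
```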

Prerequisites

Before you start using this tool, make sure you have completed the following:

Python version 3.10 or newer
Accept pyannote/segmentation-3.0 user conditions
Accept pyannote/speaker-diarization-3.0 user conditions
Create access token at hf.co/settings/tokens
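A quick way to verify part of this checklist before running anything is sketched below. It assumes you keep the access token in an environment variable named HF_TOKEN, which is a convention chosen for this example rather than something the tool requires; check_prerequisites is a hypothetical helper, and it cannot confirm that the Pyannote user conditions were accepted.

```python
import os
import sys


def check_prerequisites(env=None, min_version=(3, 10)):
    """Return a list of human-readable problems; an empty list means OK."""
    env = os.environ if env is None else env
    problems = []
    if sys.version_info < min_version:
        wanted = ".".join(str(part) for part in min_version)
        problems.append(f"Python {wanted} or newer is required")
    # HF_TOKEN is an assumed variable name for the Hugging Face access token.
    if not env.get("HF_TOKEN"):
        problems.append("HF_TOKEN is not set (needed for multi-speaker files)")
    return problems


for problem in check_prerequisites():
    print("Missing prerequisite:", problem)
```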

How to use

Clone the repository

git clone https://github.com/dubverse-ai/rvc-data-prep.git

Change working directory, install dependencies and import the utils.py script

cd rvc-data-prep
pip install -r requirements.txt

The clean function in utils.py provides automatic processing of a given file (wav, mp3 and flac only). You need to specify different parameters depending on your needs. Parameters:

local (bool): Set this to True if you are providing a local file; False if you want to create a dataset from a YouTube link.
file_path (str): Either a local file path or a YouTube URL, depending on the value of local.
project_name (str): This will be the name of the project which the processed file will be saved under.
acapella_output (bool): (BETA) If this is True, the function inserts blank audio segments while separating and segregating speakers, so that the output files add up in the time domain to recreate the original file.
single_speaker_file (bool): If True, this will flag the file as having a single speaker.
token (str): The access token of your Hugging Face account. You only need this when working with files that involve multiple speakers; otherwise you can leave it blank.
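The acapella_output behaviour described above (blank padding so that the per-speaker files sum back to the original) can be illustrated with a small sketch. This is only an illustration of the idea, not the code in utils.py; split_with_padding and its segment format are hypothetical.

```python
def split_with_padding(samples, segments, speakers):
    """Build one full-length track per speaker, silent where they do not sing.

    segments: non-overlapping (start, end, speaker) sample-index ranges.
    Every track has the same length as the input, so the tracks stay in sync.
    """
    tracks = {name: [0.0] * len(samples) for name in speakers}
    for start, end, speaker in segments:
        tracks[speaker][start:end] = samples[start:end]
    return tracks


mix = [1.0, 2.0, 3.0, 4.0]
tracks = split_with_padding(mix, [(0, 2, "A"), (2, 4, "B")], ["A", "B"])

# Summing the padded tracks sample by sample gives back the original signal.
rebuilt = [sum(values) for values in zip(*tracks.values())]
print(tracks, rebuilt == mix)
```

If a segment is skipped during separation, the sum no longer matches the original at those samples, which is why dropped segments make manual syncing harder.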

Here is an example to use the clean function:

from utils import clean

clean(local=False, 
      file_path="https://www.youtube.com/watch?v=someVideoId", 
      project_name="myProject", 
      acapella_output=True, 
      token="yourToken", 
      single_speaker_file=False)

In this example, we pass a YouTube video URL as file_path, set project_name to "myProject", and request acapella output by setting acapella_output to True. We indicate that there may be more than one speaker by setting single_speaker_file to False, and pass our Hugging Face access token as token.
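For local files, it can help to validate the arguments before calling clean. The sketch below builds the keyword arguments from the parameter descriptions above; build_clean_kwargs is a hypothetical helper that is not part of utils.py, and the checks (URL detection by prefix, extension check, token required for multi-speaker files) are assumptions drawn from this README.

```python
from pathlib import Path

# File types the README lists as supported for local input.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".flac"}


def build_clean_kwargs(source, project_name, single_speaker_file=True,
                       acapella_output=False, token=""):
    """Return keyword arguments for clean(), raising ValueError on bad input."""
    is_url = source.startswith(("http://", "https://"))
    if not is_url and Path(source).suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {source}")
    if not single_speaker_file and not token:
        raise ValueError("A Hugging Face token is needed for multi-speaker files")
    return {
        "local": not is_url,
        "file_path": source,
        "project_name": project_name,
        "acapella_output": acapella_output,
        "token": token,
        "single_speaker_file": single_speaker_file,
    }


kwargs = build_clean_kwargs("speech.wav", "myProject")
print(kwargs)
```

With the repository on your path, the result could then be passed on as clean(**kwargs).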

YouTube Tutorial

Examples

Input Video	Separated Files
Shahrukh Khan's Speech	Vocals
Yeh Ladka Haaye Allah - Bollywood Song	Udit Narayan's Vocals, Alka Yagnik's Vocals, Chorus, Other ambiguous sounds
Perfect - Ed Sheeran Duet	Ed Sheeran's Vocals, Beyonce's Vocals

Known Issues

Speaker separation is unreliable when multiple people speak at the same time.
When using acapella_output=True, some audio segments are occasionally skipped, which makes it hard to sync the outputs manually.

Contributing

We welcome contributions from anyone and everyone. Details about how to contribute, what we are looking for and how to get started can be found in our contributing guidelines.

Support

For any issues, queries, and suggestions, join our Discord server. We will be glad to help!

Future Scope

Add multispeaker Acapella support
Integrate this in the RVC workflow - base data preparation and creating AI covers
Improve the efficiency of speaker identification using other models like TitaNet

About Us

We, at Dubverse.ai, are a dedicated and passionate group of developers who have been working for over three years on generative AI with a specific emphasis on audio. We deeply believe in the potential of AI to revolutionize the fields of video, voiceover, podcasts and other media-related applications.

Our passion and dedication don't stop at development. We believe in sharing knowledge and nurturing a community of like-minded enthusiasts. That's why we maintain a deep tech blog where we talk about our latest research, development, trends in the field, and insights about generative AI and audio technologies.

Check out some of our RVC blog posts:

We are always open to hearing from others who share our passion. Whether you're an expert in the field, a hobbyist, or just someone intrigued by AI and audio, feel free to reach out and connect with us.

License

RVC Data Prep is licensed under the MIT License. See the LICENSE file for details.

Disclaimer: This repo is not affiliated with YouTube, Facebook AI Research, or Pyannote. All trademarks referred to are the property of their respective owners.

Acknowledgements

Facebook Demucs, Pyannote Audio, Librosa, FFmpeg, and other audio-related libraries.
The Dubverse Black Discord and the AI Hub Discord for quick and actionable feedback.

We value your feedback and encourage you to provide us with any suggestions or issues that you may encounter. Let's make this tool better together!
