[DMP 2024]: Auto Subtitler for Indian Languages #2

@ghost

Description

Ticket Contents

As part of the BIRD initiative, we aim to create a tool that speeds up the adoption of Same Language Subtitling (SLS) among content producers across the country.
This will ensure that 200M weak readers and 30M readers with accessibility needs get regular reading exposure through content that carries SLS.

The tool will create an SRT file from a video file and the text file containing its script.
For now, we aim to support the following languages: Tamil, Telugu, and Kannada.

Goals & Mid-Point Milestone

Goal 1:
Achieve 60% timing accuracy for SRT files in Tamil.
Achieve 60% timing accuracy for SRT files in Telugu.
Achieve 60% timing accuracy for SRT files in Kannada.

Goal 2:
Achieve 70% timing accuracy for SRT files in Tamil.
Achieve 70% timing accuracy for SRT files in Telugu.
Achieve 70% timing accuracy for SRT files in Kannada.

Goal 3:
Achieve 80% timing accuracy for SRT files in Tamil.
Achieve 80% timing accuracy for SRT files in Telugu.
Achieve 80% timing accuracy for SRT files in Kannada.

Goal 4:
Achieve 90% timing accuracy for SRT files in Tamil.
Achieve 90% timing accuracy for SRT files in Telugu.
Achieve 90% timing accuracy for SRT files in Kannada.

The mid-point milestone is the completion of Goal 1 and Goal 2.

Setup/Installation

No response

Expected Outcome

The input will be a video file and its script as a text file, encoded in UTF-8.
The output will be an SRT file with a timecode for each line of the script.
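To make the expected output concrete, here is a minimal Python sketch of the final step only: turning already-aligned script lines into SRT text. It assumes the start and end time of each line (in seconds) has been produced by some alignment step that is not shown here. The names `Cue`, `format_timestamp`, and `build_srt` are hypothetical and not part of any existing tool.

```python
from typing import List, Tuple

# Hypothetical cue type: (start seconds, end seconds, one line of the script).
Cue = Tuple[float, float, str]


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timecode form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(cues: List[Cue]) -> str:
    """Render cues as SRT text: index, timecode line, subtitle text, blank line."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timecode = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timecode}\n{text}\n")
    # Joining with "\n" leaves one blank line between consecutive cues.
    return "\n".join(blocks)
```

The result can be written with `open(path, "w", encoding="utf-8")` so that Tamil, Telugu, and Kannada text is stored in the same encoding as the input script.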

Acceptance Criteria

We will use VLC media player to check the timing accuracy of the generated SRT file; the same check will be used to verify completion of each goal. We will test with multiple video files to confirm that the tool is versatile.
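This ticket does not define how the timing accuracy percentage is calculated. One possible reading, which is purely an assumption, is the share of subtitle lines whose generated start time falls within a fixed tolerance of a manually verified start time (for example, one noted while watching in VLC). The sketch below illustrates that reading; the function name `timing_accuracy` and the 0.5 second default tolerance are hypothetical choices, not project requirements.

```python
from typing import Sequence


def timing_accuracy(
    generated_starts: Sequence[float],
    reference_starts: Sequence[float],
    tolerance_s: float = 0.5,  # assumed tolerance, not specified in the ticket
) -> float:
    """Return the fraction of lines whose start time is within the tolerance."""
    if len(generated_starts) != len(reference_starts):
        raise ValueError("Both sequences must have one start time per script line.")
    if not reference_starts:
        return 0.0
    hits = sum(
        1
        for generated, reference in zip(generated_starts, reference_starts)
        if abs(generated - reference) <= tolerance_s
    )
    return hits / len(reference_starts)
```

Under this assumed metric, a file in which three of four lines start within the tolerance would score 0.75, which would satisfy Goal 2 (70%) but not Goal 3 (80%).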

Implementation Details

Python or any other technical stack.

Mockups/Wireframes

No response

Product Name

Auto Subtitler for Indian Languages

Organisation Name

Planet Read

Domain

⁠Education

Tech Skills Needed

Machine Learning, Python

Mentor(s)

@arvind-planetread

Category

Accessibility, Machine Learning
