OpenViseme

A simple neural network to compute mouth shape from audio. Data and idea taken from Magicboomliu/Viseme-Classification.

This project is a complete rewrite of, and improvement over, Magicboomliu/Viseme-Classification, with better data resampling, cleanup and code quality. It is in active development to improve upon the original accuracy.

How to use

Download the DataSet folder from Viseme-Classification/DataSet and put it into the root folder
Run prepare_data.py. This generates DataSet/mel.npy and DataSet/label.npy, which together form the dataset used for training
Run baseline.py to get a baseline accuracy for low-effort models (using sklearn, to evaluate initial performance for different ML algorithms)
Run train.py to generate model.pth, which is trained with PyTorch
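
After running prepare_data.py, it can be useful to sanity-check the generated arrays before training. The following is a minimal sketch, not code from this repository: the helper names load_dataset and split_indices are hypothetical, and it assumes the first axis of both arrays indexes samples, which the README does not state.

```python
from pathlib import Path

import numpy as np


def load_dataset(root="DataSet"):
    """Load mel.npy and label.npy from `root` and check that they line up.

    Assumption: axis 0 of both arrays is the sample axis.
    """
    root = Path(root)
    mel = np.load(root / "mel.npy")
    label = np.load(root / "label.npy")
    if len(mel) != len(label):
        raise ValueError(
            f"mel has {len(mel)} samples but label has {len(label)}"
        )
    return mel, label


def split_indices(n_samples, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) as a shuffled, non-overlapping split.

    The 80/20 default is illustrative only; the split used by
    baseline.py and train.py is not documented here.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]
```

Calling load_dataset() from the repository root and then printing mel.shape and np.unique(label, return_counts=True) gives a quick view of the feature dimensions and class balance.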

For me, the baseline (neural network) yields 62% accuracy on the testing set

Train Accuracy: 0.6464355131983196
Test Accuracy: 0.624623871614844

While the PyTorch one yields a stable 64% accuracy (the difference is statistically significant)

Epoch: 600, Loss: 1.7374, Test Accuracy: 0.6387, Test Loss: 2.1781
Lowest Loss: 2.1706
Best test Accuracy: 0.6387
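
The README does not say how significance was assessed or how large the test set is. As one possible way to check such a claim, here is a sketch of a two-proportion z-test using only the standard library. The function name and the sample-size arguments are hypothetical. It also treats the two accuracies as coming from independent samples, which is only an approximation when both models are scored on the same test set (a paired test such as McNemar's would fit that case better).

```python
import math


def two_proportion_z_test(acc_a, acc_b, n_a, n_b):
    """Return (z, two-sided p-value) for the difference acc_b - acc_a.

    acc_a, acc_b: accuracies as fractions in [0, 1]
    n_a, n_b:     number of test samples behind each accuracy (assumed known)
    """
    # Pooled accuracy under the null hypothesis of no difference.
    pooled = (acc_a * n_a + acc_b * n_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    if se == 0.0:
        # Both accuracies are exactly 0 or exactly 1: no variance to test.
        return 0.0, 1.0
    z = (acc_b - acc_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

To apply it to the numbers above, pass 0.6246 and 0.6387 as the accuracies together with the actual number of test samples. The result depends strongly on that size, so it has to come from the data rather than a guess.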

marty1885/OpenViseme
