Trains low-rank CNNs from raw speech using Keras/TensorFlow, with inputs from Kaldi directories.
- Trains CNNs from a Kaldi GMM system
- Works with standard Kaldi data and alignment directories
- Decodes test utterances in Kaldi style
- Python 3.4+
- Keras with a TensorFlow or Theano backend
- Kaldi (obtained using git)
- Train a GMM system in Kaldi.
- Place steps_kt and run_*.sh in the working directory.
- Apply the patch compute-raw-feats.patch to Kaldi. To do this:

      . ./path.sh    ## To get the $KALDI_ROOT environment variable.
      mv compute-raw-feats.patch $KALDI_ROOT/
      cd $KALDI_ROOT/
      git apply compute-raw-feats.patch
      cd src/
      make depend    ## [-j 4]
      make           ## [-j 4]

  Note: this creates a new executable compute-raw-feats in the src/featbin/ directory of Kaldi. It does not alter any of the existing Kaldi tools.
- Extract raw features using extract.sh.
- Configure and run run_*.sh.
  - run_rawcnn.sh trains triphone models. Provide the model architecture as an argument; see steps_kt/model_architecture.py for valid options. Optionally, provide a CNN directory to initialise the model weights from; the architecture is expected to be the same, except for the output layer. This is useful for initialising a triphone CNN from a monophone CNN.
  - run_rawcnn_mono.sh trains monophone models; the model architecture is its only argument. After training a CNN, it computes forced alignments and re-trains the CNN. This expectation-maximisation is performed for two iterations to obtain better models.
- train*_rawcnn.py is the Keras training script.
- Model architecture can be configured in model_architecture.py (an illustrative sketch follows after this list).
- dataGeneratorSRaw.py provides an object that reads Kaldi data and alignment directories in batches and retrieves mini-batches for training (a sketch of the idea follows after this list).
- nnet-forward-norm-arch.py passes test features through the trained CNNs and outputs log posterior probabilities in Kaldi format (see the sketch after this list).
- kaldiIO.py reads and writes Kaldi-type binary features (see the format sketch after this list).
- decode_norm_arch.py is the decoding script.
- align_norm_arch.sh is the alignment script.
- compute_priors.py computes the class priors (see the sketch after this list).
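For orientation, here is a minimal sketch of the kind of model an architecture function might return, assuming raw-waveform input windows and the Keras Sequential API. The function name `example_raw_cnn`, the layer sizes and the bottleneck dimension are illustrative placeholders, not the options actually defined in steps_kt/model_architecture.py:

```python
# Illustrative only: a small 1-D CNN over raw samples with a low-rank dense layer.
# Not the repository's actual architecture (see steps_kt/model_architecture.py).
from keras.models import Sequential
from keras.layers import Reshape, Conv1D, Flatten, Dense

def example_raw_cnn(input_dim, num_classes, bottleneck_dim=64):
    """Return a simple raw-speech CNN ending in a softmax over CD/CI states."""
    model = Sequential([
        Reshape((input_dim, 1), input_shape=(input_dim,)),      # raw samples -> (time, 1)
        Conv1D(40, kernel_size=30, strides=10, activation='relu'),
        Conv1D(60, kernel_size=7, strides=3, activation='relu'),
        Flatten(),
        Dense(bottleneck_dim, activation='linear'),              # low-rank (bottleneck) layer
        Dense(1024, activation='relu'),
        Dense(num_classes, activation='softmax'),
    ])
    return model

model = example_raw_cnn(input_dim=4000, num_classes=2000)
model.summary()
```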
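The next sketch illustrates the general pattern behind a mini-batch generator for Keras training: shuffle the frames and yield fixed-size batches with one-hot targets. It works on in-memory NumPy arrays rather than Kaldi data and alignment directories, so it is only a stand-in for what dataGeneratorSRaw.py actually does:

```python
# Illustrative only: the idea behind a Keras mini-batch generator.
# The real dataGeneratorSRaw.py reads Kaldi data and alignment directories.
import numpy as np

def minibatch_generator(features, labels, num_classes, batch_size=256):
    """Yield shuffled (x, y) mini-batches indefinitely, with one-hot targets."""
    num_frames = len(features)
    while True:
        order = np.random.permutation(num_frames)
        for start in range(0, num_frames - batch_size + 1, batch_size):
            idx = order[start:start + batch_size]
            x = features[idx]
            y = np.zeros((batch_size, num_classes), dtype=np.float32)
            y[np.arange(batch_size), labels[idx]] = 1.0
            yield x, y

# Example usage with random data standing in for raw-speech frames:
feats = np.random.randn(10000, 4000).astype(np.float32)
labs = np.random.randint(0, 2000, size=10000)
gen = minibatch_generator(feats, labs, num_classes=2000)
x, y = next(gen)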
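The forward pass itself amounts to taking the log of the network's softmax outputs; in hybrid systems the log priors are often subtracted as well to obtain scaled likelihoods, which is a common practice rather than a statement about what nnet-forward-norm-arch.py does internally. Feature reading, model loading and Kaldi-format writing are assumed to happen elsewhere (e.g. via kaldiIO.py); the arrays below are random stand-ins:

```python
# Illustrative only: turning network outputs into log posteriors
# (optionally prior-normalised) before Kaldi decoding.
import numpy as np

def log_posteriors(posteriors, priors=None, floor=1e-30):
    """Return log posteriors; subtract log priors if given (hybrid-ASR practice)."""
    logp = np.log(np.maximum(posteriors, floor))
    if priors is not None:
        logp -= np.log(np.maximum(priors, floor))
    return logp

post = np.random.dirichlet(np.ones(2000), size=500).astype(np.float32)  # fake model output
priors = np.full(2000, 1.0 / 2000, dtype=np.float32)                    # fake priors
out = log_posteriors(post, priors)
```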
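As background on the binary format, the sketch below reads one key/matrix pair from an uncompressed binary Kaldi archive (the "\0B" marker followed by an "FM " single-precision matrix). It is not the repository's kaldiIO.py, the 'feats.ark' path is hypothetical, and compressed "CM" matrices would need extra handling:

```python
# Illustrative only: reading one <key, matrix> pair from a Kaldi binary archive.
import struct
import numpy as np

def read_kaldi_matrix(fd):
    """Read one utterance key and its float32 matrix from an open binary ark."""
    key = b''
    while True:
        c = fd.read(1)
        if not c:
            return None, None            # end of archive
        if c == b' ':
            break
        key += c
    assert fd.read(2) == b'\0B'          # binary-mode marker
    assert fd.read(3) == b'FM '          # single-precision matrix token
    assert fd.read(1) == b'\x04'         # size of the following int32
    rows = struct.unpack('<i', fd.read(4))[0]
    assert fd.read(1) == b'\x04'
    cols = struct.unpack('<i', fd.read(4))[0]
    data = np.frombuffer(fd.read(4 * rows * cols), dtype=np.float32)
    return key.decode(), data.reshape(rows, cols)

with open('feats.ark', 'rb') as fd:      # hypothetical archive path
    utt, mat = read_kaldi_matrix(fd)
```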
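Prior estimation itself is just relative frequency counting over the aligned frames. The sketch below shows that idea; compute_priors.py may differ in how it reads the alignments and where it stores the result:

```python
# Illustrative only: estimating class priors from training alignments.
import numpy as np

def estimate_priors(alignment_labels, num_classes):
    """Relative frequency of each output class over all aligned frames."""
    counts = np.bincount(alignment_labels, minlength=num_classes).astype(np.float64)
    return counts / counts.sum()

labels = np.random.randint(0, 2000, size=100000)    # stand-in for alignment labels
priors = estimate_priors(labels, num_classes=2000)
np.save('priors.npy', priors)                        # hypothetical output file
```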
The training script uses stochastic gradient descent with a momentum of 0.5. It starts with a learning rate of 0.1 and trains for a minimum of 5 epochs. Whenever the validation loss decreases by less than 0.002 between successive epochs, the learning rate is halved; halving is performed a total of 18 times.
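A minimal sketch of that schedule, assuming a Keras model and fit_generator-style data generators; the function `train_with_halving` and its arguments are illustrative placeholders, not the actual code in train*_rawcnn.py:

```python
# Illustrative only: SGD with momentum 0.5, initial learning rate 0.1, a minimum
# of 5 epochs, halving the learning rate when the validation loss improves by
# less than 0.002, for at most 18 halvings.
import keras.backend as K
from keras.optimizers import SGD

def train_with_halving(model, train_gen, steps_train, val_gen, steps_val,
                       init_lr=0.1, min_epochs=5, threshold=0.002, max_halvings=18):
    model.compile(optimizer=SGD(lr=init_lr, momentum=0.5),
                  loss='categorical_crossentropy', metrics=['accuracy'])
    prev_val_loss = float('inf')
    halvings = 0
    epoch = 0
    while halvings < max_halvings:
        model.fit_generator(train_gen, steps_per_epoch=steps_train, epochs=1, verbose=2)
        val_loss = model.evaluate_generator(val_gen, steps=steps_val)[0]
        epoch += 1
        # After the minimum number of epochs, halve the learning rate whenever
        # the validation loss improves by less than the threshold.
        if epoch >= min_epochs and prev_val_loss - val_loss < threshold:
            halvings += 1
            K.set_value(model.optimizer.lr, K.get_value(model.optimizer.lr) / 2.0)
        prev_val_loss = val_loss
    return model
```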
Idiap Research Institute
Authors: S. Pavankumar Dubagunta, Vinayak Abrol and Mathew Magimai-Doss.
GNU GPL v3