Implementation of SANAS (see the paper on arXiv), a model that dynamically adapts the architecture of a deep neural network at test time for efficient sequence classification.
- Create an environment with Python 3.6
- `pip install -r requirements.txt`
- Download the Speech Commands v0.01 archive.
- Extract the dataset and pass the extracted folder path as the `root_path` argument (defaults to `./data/speech_commands_v0.01`).
- The Speech Commands data processing is based on honk; credit goes to the authors!
- Speech Commands dataset paper on arXiv.
`python main.py with adam speech_commands gru kwscnn static=True use_mongo=False ex_path=<path_to_save_location>/runs`
- If no `ex_path` is specified, logs and models will be saved under `./runs`.
- Create a JSON file containing the required connection information:
{
"user": "Me",
"passwd": "MySecurePassword",
"host": "localhost",
"port": "27017",
"db": "sanas",
"collection": "runs"
}
`python main.py with adam speech_commands gru kwscnn static=True use_mongo=True mongo_config_path=<path_to_config>/mongo_config.json`
- `mongo_config_path` defaults to `./resources/mongo_credentials.json`.
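As a sanity check, the credentials file can be loaded and turned into a standard MongoDB connection URI. This is a minimal sketch for illustration only: `load_mongo_uri` is a hypothetical helper name, not part of the repo, and how the code actually consumes the file may differ.

```python
import json

def load_mongo_uri(config_path):
    """Build a MongoDB connection URI from the JSON credentials file.

    Hypothetical helper for illustration; the `collection` field is not
    part of the URI and would be used separately when selecting a collection.
    """
    with open(config_path) as f:
        cfg = json.load(f)
    # str.format ignores the extra "collection" key in the config dict.
    return "mongodb://{user}:{passwd}@{host}:{port}/{db}".format(**cfg)
```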
`python main.py with adam speech_commands gru kwscnn static=True use_visdom=False`
- Visdom will connect to `localhost:8097` by default. To specify the server, create a config file:
{
"server": "http://localhost",
"port": 8097
}
`python main.py with adam speech_commands gru kwscnn static=True visdom_config_path=<path_to_config>/vis_config.json`
The `__getitem__(self, idx)` method of a dataset should return a tuple `(x, y)` with:
- `x` of size `seq_len x feature_dims`. For example, `feature_dims` for traditional images is `(C, H, W)`.
- `y` of size `seq_len`.
It is possible to use the `PadCollate` class in the dataloader to pad each sequence to the length of the longest one in the sampled batch.
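The interface above can be sketched as follows. This is a minimal illustration of the expected `(x, y)` contract and of batch-level padding in the spirit of `PadCollate`; `ToySequenceDataset` and `pad_collate` are hypothetical names, and the repo's actual implementation operates on torch tensors rather than plain Python lists.

```python
class ToySequenceDataset:
    """Each item is (x, y): x has shape seq_len x feature_dims, y has length seq_len."""

    def __init__(self, sequences, labels):
        self.sequences = sequences  # list of [seq_len][feature_dims] nested lists
        self.labels = labels        # list of [seq_len] per-step label lists

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, idx):
        return self.sequences[idx], self.labels[idx]


def pad_collate(batch, pad_value=0):
    """Pad every (x, y) in the batch to the length of the longest sequence."""
    max_len = max(len(x) for x, _ in batch)
    xs, ys = [], []
    for x, y in batch:
        feature_dims = len(x[0])
        pad = max_len - len(x)
        xs.append(list(x) + [[pad_value] * feature_dims] * pad)
        ys.append(list(y) + [pad_value] * pad)
    return xs, ys
```

In a torch-based version, `pad_collate` would be passed as the `collate_fn` of the `DataLoader` so that each sampled batch is padded independently.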