Creating a task -> build.py downloads the corpus and installs it into the /data directory; if it is already present, it does nothing. -> agents.py creates the teachers: they are the ones that parse the data (in training mode they store the text plus the labels) and that automatically switch into question-answering or dialogue mode. They play the role of the user when training the agents. ParlAI provides a whole range of teachers; ours is a subclass of ParlAIDialogTeacher, see the documentation for details.
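For reference, ParlAIDialogTeacher reads plain-text files with one tab-separated example per line, using `field:value` pairs (the content below is an illustrative toy dialogue, not from the actual corpus):

```
text:hello, how are you today?	labels:i'm fine, thanks! and you?
text:doing great, just got back from a ride.	labels:nice, i haven't been on a bike in years!	episode_done:True
```

Each line is one turn (teacher `text`, expected learner `labels`); `episode_done:True` marks the end of a dialogue episode.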
Using ParlAI -> the /examples directory contains a whole set of (fairly crude) scripts that help in understanding how the world system works. display_data is a good example:
- A world is created that, by default, matches our data (class DialogueWorld(World)); it implements world.parley() -> agent1 (the teacher).act() / agent2 (the learner).observe() / agent2.act() / agent1.observe().
- The act and observe methods are specific to each agent. For the teacher we use the defaults; where it gets interesting is the learner.
- To load a learning agent, ParlAI implements a number of agent models. For display_data we use the RepeatLabelAgent, which reads what the teacher gives it and parrots it straight back.
  In the script train_data.py we use the MemNNAgent (a.k.a. Memory Network agent), which behaves in its own particular way compared to other learning agents. The results of training are saved to a model file.
- In the main function, a while not "EPOCH DONE" loop wraps world.parley(): it keeps running until all the dialogues have been learned.
- The framework behind all the learning agents is PyTorch; there is in fact a TorchAgent class that most agents inherit from, including the MemNN agent.
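The act/observe cycle and the epoch loop described above can be sketched in plain Python without ParlAI installed. The class names below (ToyTeacher, RepeatLabelStyleAgent) are illustrative, not ParlAI's real classes; only the method names act, observe, parley, and epoch_done mirror ParlAI's conventions:

```python
# Minimal sketch of ParlAI's world/agent mechanics, assuming nothing
# beyond the standard library. In ParlAI the real base classes live in
# parlai.core.agents and parlai.core.worlds.

class ToyTeacher:
    """Plays the user: emits one (text, labels) example per parley."""
    def __init__(self, examples):
        self.examples = list(examples)
        self.index = 0
        self.last_reply = None

    def act(self):
        msg = dict(self.examples[self.index])
        self.index += 1
        msg['episode_done'] = self.index >= len(self.examples)
        return msg

    def observe(self, reply):
        # A real teacher would update metrics (accuracy, F1, ...) here.
        self.last_reply = reply

    def epoch_done(self):
        return self.index >= len(self.examples)


class RepeatLabelStyleAgent:
    """Like ParlAI's RepeatLabelAgent: parrots the teacher's label back."""
    def __init__(self):
        self.observation = None

    def observe(self, msg):
        self.observation = msg

    def act(self):
        labels = self.observation.get('labels', ['I do not know.'])
        return {'text': labels[0]}


def parley(teacher, learner):
    """One turn: teacher acts, learner observes then acts, teacher observes."""
    learner.observe(teacher.act())
    teacher.observe(learner.act())


teacher = ToyTeacher([
    {'text': 'What is the capital of France?', 'labels': ['Paris']},
    {'text': 'What is 2 + 2?', 'labels': ['4']},
])
learner = RepeatLabelStyleAgent()

# The display/training loop: keep calling parley until the epoch is done.
while not teacher.epoch_done():
    parley(teacher, learner)
    print(learner.observation['text'], '->', teacher.last_reply['text'])
```

A training agent such as MemNNAgent would replace the parroting in act() with a model forward pass, and use the labels seen in observe() to compute a loss and update its parameters.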
You can find the scripts I wrote entirely from scratch in /parlai/tasks/cologne; the data is in /data/cologne.
ParlAI (pronounced “par-lay”) is a python framework for sharing, training and testing dialogue models, from open-domain chitchat to VQA (Visual Question Answering).
Its goal is to provide researchers:
- 80+ popular datasets available all in one place, with the same API, among them PersonaChat, DailyDialog, Wizard of Wikipedia, Empathetic Dialogues, SQuAD, MS MARCO, QuAC, HotpotQA, QACNN & QADailyMail, CBT, BookTest, bAbI Dialogue tasks, Ubuntu Dialogue, OpenSubtitles, Image Chat, VQA, VisDial and CLEVR. See the complete list here.
- a wide set of reference models -- from retrieval baselines to Transformers.
- a large zoo of pretrained models ready to use off-the-shelf
- seamless integration of Amazon Mechanical Turk for data collection and human evaluation
- integration with Facebook Messenger to connect agents with humans in a chat interface
- a large range of helpers to create your own agents and train on several tasks with multitasking
- multimodality, some tasks use text and images
ParlAI is described in the following paper: “ParlAI: A Dialog Research Software Platform", arXiv:1705.06476 or see these more up-to-date slides.
See the news page for the latest additions & updates, and the website http://parl.ai for further docs.
ParlAI currently requires Python 3 and PyTorch 1.1 or newer. Dependencies of the core modules are listed in requirements.txt. Some included models (in parlai/agents) have additional requirements.
Run the following commands to clone the repository and install ParlAI:
git clone https://github.com/facebookresearch/ParlAI.git ~/ParlAI
cd ~/ParlAI; python setup.py develop

This will link the cloned directory to your site-packages.
This is the recommended installation procedure, as it provides ready access to the examples and allows you to modify anything you might need. This is especially useful if you want to submit another task to the repository.
All needed data will be downloaded to ~/ParlAI/data, and any non-data files if requested will be downloaded to ~/ParlAI/downloads. If you need to clear out the space used by these files, you can safely delete these directories and any files needed will be downloaded again.
- Quick Start
- Basics: world, agents, teachers, action and observations
- List of available tasks/datasets
- Creating a dataset/task
- List of available agents
- Creating a new agent
- Model zoo (pretrained models)
- Plug into MTurk
- Plug into Facebook Messenger
A large set of scripts can be found in parlai/scripts. Here are a few of them.
Note: If any of these examples fail, check the requirements section to see if you have missed something.
Display 10 random examples from the SQuAD task
python -m parlai.scripts.display_data -t squad

Evaluate an IR baseline model on the validation set of the PersonaChat task:

python -m parlai.scripts.eval_model -m ir_baseline -t personachat -dt valid

Train a single-layer transformer on PersonaChat (requires PyTorch and torchtext). Details: embedding size 300, 4 attention heads, 2 epochs with batch size 64; word vectors are initialized with fastText, and the other elements of the batch are used as negatives during training.

python -m parlai.scripts.train_model -t personachat -m transformer/ranker -mf /tmp/model_tr6 --n-layers 1 --embedding-size 300 --ffn-size 600 --n-heads 4 --num-epochs 2 -veps 0.25 -bs 64 -lr 0.001 --dropout 0.1 --embedding-type fasttext_cc --candidates batch

The code is set up into several main directories:
- core: contains the primary code for the framework
- agents: contains agents which can interact with the different tasks (e.g. machine learning models)
- scripts: contains a number of useful scripts, like training, evaluating, interactive chatting, ...
- tasks: contains code for the different tasks available from within ParlAI
- mturk: contains code for setting up Mechanical Turk, as well as sample MTurk tasks
- messenger: contains code for interfacing with Facebook Messenger
- zoo: contains code to directly download and use pretrained models from our model zoo
If you have any questions, bug reports or feature requests, please don't hesitate to post on our Github Issues page.
ParlAI is currently maintained by Emily Dinan, Dexter Ju, Margaret Li, Spencer Poff, Pratik Ringshia, Stephen Roller, Kurt Shuster, Eric Michael Smith, Jack Urbanek, Jason Weston, and Mary Williamson.
Former major contributors and maintainers include Alexander H. Miller, Will Feng, Adam Fisch, Jiasen Lu, Antoine Bordes, Devi Parikh, Dhruv Batra, Filipe de Avila Belbute Peres and Chao Pan.
Please cite the arXiv paper if you use ParlAI in your work:
@article{miller2017parlai,
title={ParlAI: A Dialog Research Software Platform},
author={{Miller}, A.~H. and {Feng}, W. and {Fisch}, A. and {Lu}, J. and {Batra}, D. and {Bordes}, A. and {Parikh}, D. and {Weston}, J.},
journal={arXiv preprint arXiv:{1705.06476}},
year={2017}
}
ParlAI is MIT licensed. See the LICENSE file for details.

