Install Poetry: https://python-poetry.org/
Run "poetry install" to install the dependencies and set up the venv.
To add Poetry to the PATH (if it is not added automatically), run:
echo 'export PATH="/home/jovyan/.local/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
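Running the echo command above more than once appends the same line to ~/.bashrc each time. A minimal sketch of a guard against that is shown here; append_once is a hypothetical helper name, not something Poetry or this project provides.

```shell
# Hypothetical helper (not part of this project): append a line to a file
# only if that exact line is not already present.
append_once() {
    line=$1
    file=$2
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example call, using the same path as the command above:
#   append_once 'export PATH="/home/jovyan/.local/bin:$PATH"' ~/.bashrc
```

After calling it, "source ~/.bashrc" is still needed for the current shell to pick up the change.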
Install on VM:
curl -sSL https://install.python-poetry.org | python3 -
Otherwise, once Poetry is installed on the VM, it can also be run directly via its full path:
~/.local/share/pypoetry/venv/bin/poetry
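The choice between the poetry on the PATH and the full path above can be sketched as a small shell function. find_poetry is a hypothetical name used only for illustration, and the fallback location is taken from the path shown above; it may differ on other installs.

```shell
# Hypothetical helper (not part of this project): print the Poetry executable to use.
find_poetry() {
    fallback="$HOME/.local/share/pypoetry/venv/bin/poetry"
    if command -v poetry >/dev/null 2>&1; then
        # Poetry is already on the PATH.
        command -v poetry
    elif [ -x "$fallback" ]; then
        # Fall back to the installer's venv location (assumed from the path above).
        printf '%s\n' "$fallback"
    else
        echo "poetry not found; install it first" >&2
        return 1
    fi
}

# Example: "$(find_poetry)" install
```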
PyCharm Venv:
To add the newly generated Poetry venv in PyCharm, see: https://www.jetbrains.com/help/pycharm/poetry.html#existing-poetry-environment
To download the datasets and train the tokenizers, run:
poetry run train-tokenizers
To run the tests:
poetry run pytest tests
To train the tokenizers and then the model:
poetry run train-tokenizers
poetry run train-model
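The commands in this file can be chained into one function that stops at the first failure. This is only a sketch: run_all and the POETRY variable are hypothetical names, and the order (install, tokenizers, tests, model) is an assumption rather than something the project states.

```shell
# Hypothetical wrapper (not part of this project).
# POETRY may be set to a full path, e.g.
#   POETRY=~/.local/share/pypoetry/venv/bin/poetry
run_all() {
    poetry_bin=${POETRY:-poetry}
    # Each step runs only if the previous one succeeded.
    "$poetry_bin" install &&
    "$poetry_bin" run train-tokenizers &&
    "$poetry_bin" run pytest tests &&
    "$poetry_bin" run train-model
}
```

Setting POETRY=echo prints the commands instead of running them, which is a cheap way to check the order before starting a long training run.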