Welcome to WordMix! This is a simple command-line game where you mix two words together by combining their word embeddings. The game uses pre-trained GloVe embeddings to find the word that is semantically closest to the combination of your chosen words.
In this game, you'll be presented with a list of 10 random nouns. You select any two words by their numbers, and the game will "mix" them by adding their vector embeddings together. It then finds and displays the word whose embedding is closest to this new vector. It's a fun way to explore semantic relationships between words!
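The mixing step described above can be sketched in a few lines. This is a minimal illustration rather than the game's actual code: the function name `mix_words`, the use of cosine similarity as the closeness measure, the choice to skip the two input words, and the three-dimensional toy vectors are all assumptions made for the example.

```python
import numpy as np

def mix_words(word_a, word_b, embeddings):
    """Return the word whose vector is closest to embeddings[word_a] + embeddings[word_b]."""
    mixed = embeddings[word_a] + embeddings[word_b]
    best_word, best_score = None, -np.inf
    for word, vec in embeddings.items():
        if word in (word_a, word_b):
            continue  # assumption: the two input words are not valid answers
        # Cosine similarity between the mixed vector and this candidate
        score = np.dot(mixed, vec) / (np.linalg.norm(mixed) * np.linalg.norm(vec))
        if score > best_score:
            best_word, best_score = word, score
    return best_word

# Made-up toy vectors; real GloVe vectors from glove.6B.100d.txt have 100 dimensions
toy = {
    "water": np.array([1.0, 0.0, 0.0]),
    "cold": np.array([0.0, 1.0, 0.0]),
    "ice": np.array([0.9, 0.9, 0.1]),
    "fire": np.array([0.1, -1.0, 0.2]),
}
print(mix_words("water", "cold", toy))  # prints "ice"
```

With real embeddings, the same loop would run over the full vocabulary, so a vectorized similarity computation would likely be preferable to a Python loop.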
- Uses pre-trained GloVe word embeddings for semantic word representation.
- Randomly selects words from a dictionary of English nouns.
- Simple command-line interface.
- Educational tool to understand word embeddings and vector arithmetic.
- Clone the Repository
      git clone https://github.com/frrobledo/WordMix
      cd WordMix
- Download GloVe Embeddings
  - Download the pre-trained GloVe embeddings from the GloVe website.
  - Specifically, download the glove.6B.zip file (~822 MB).
  - Extract the file and place glove.6B.100d.txt in the WordMix directory.
- Create a Virtual Environment (Optional but Recommended)
      python -m venv venv
      source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
- Install Dependencies
      pip install -r requirements.txt
  Alternatively, install dependencies manually:
      pip install numpy nltk scikit-learn
- Download NLTK Data Files
  The script will attempt to download necessary NLTK data files on the first run. If you encounter issues, you can download them manually:
      import nltk
      nltk.download('wordnet')
      nltk.download('omw-1.4')
- Run the Game
      python word_mixing_game.py
- Gameplay
  - The game will display 10 random words numbered from 1 to 10.
  - Enter two numbers corresponding to the words you want to mix.
  - The game will output the resulting word.
  - Enter 'Q' or 'q' at any time to quit the game.
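The input handling in the gameplay steps above might look roughly like the following. This is a sketch under stated assumptions, not the code in `word_mixing_game.py`: the helper name `parse_choice` is hypothetical, and it assumes both numbers are typed on one line, separated by a space or comma.

```python
def parse_choice(raw, n_words=10):
    """Interpret one line of player input.

    Returns "quit" for 'Q'/'q', a pair of zero-based indices for two
    valid numbers, or None if the input cannot be understood.
    """
    text = raw.strip()
    if text.lower() == "q":
        return "quit"
    parts = text.replace(",", " ").split()
    if len(parts) != 2:
        return None
    try:
        first, second = (int(p) for p in parts)
    except ValueError:
        return None  # at least one entry was not a whole number
    if not (1 <= first <= n_words and 1 <= second <= n_words):
        return None  # outside the displayed range 1..n_words
    return first - 1, second - 1

# Hypothetical word list, numbered the way the game displays it
words = ["river", "stone", "cloud", "engine", "garden",
         "mirror", "island", "candle", "bridge", "forest"]
for number, word in enumerate(words, start=1):
    print(f"{number}. {word}")

print(parse_choice("3 7"))  # prints (2, 6)
print(parse_choice("q"))    # prints quit
```

Returning `None` for bad input lets the calling loop ask again instead of crashing on a typo.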
- Python 3.6 or higher
- NumPy
- NLTK
- scikit-learn
- GloVe Embeddings (glove.6B.100d.txt)
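The GloVe file is plain text, with one word per line followed by its space-separated vector components. A loader for that format could be sketched as below. The function name `load_glove` is hypothetical, and the game's own loading code may differ. The example parses a tiny in-memory sample so it runs without the 822 MB download.

```python
import io
import numpy as np

def load_glove(lines):
    """Parse GloVe-format text lines ('word v1 v2 ... vn') into a dict of word -> vector."""
    embeddings = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        embeddings[parts[0]] = np.array([float(x) for x in parts[1:]], dtype=np.float32)
    return embeddings

# Tiny made-up sample in the same layout as the real file
sample = io.StringIO("cat 0.1 0.2 0.3\ndog 0.4 0.5 0.6\n")
vectors = load_glove(sample)
print(sorted(vectors), vectors["cat"].shape)  # prints ['cat', 'dog'] (3,)
```

For the real file, the same function can be given an open file handle, for example `with open("glove.6B.100d.txt", encoding="utf-8") as f: vectors = load_glove(f)`, in which case each vector has 100 components.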
- GloVe: Global Vectors for Word Representation by Stanford NLP Group.
- NLTK: Natural Language Toolkit
