cs542-adventure

Planning

Directly Prompting Off-The-Shelf LLMs
- Prompt engineering
- Multiple prompts (agentic)
  - Give it questions to answer to build a "train of thought", then prompt it to actually give the action to take
- Can reasoning models such as Qwen or GPT-OSS handle the above and perform better?
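A minimal sketch of the multi-prompt idea above: ask a few questions to build up notes, then prompt once more for the action itself. The question list, the `choose_action` name, and the `"look"` fallback are all assumptions for illustration. `llm` is any function that takes a prompt string and returns text, so it could wrap whichever model we end up testing.

```python
from typing import Callable, List

# Hypothetical reasoning questions; the plan does not say which ones to ask.
QUESTIONS = [
    "Where are you, and what objects are here?",
    "What is your current goal?",
    "Which exits or actions look most promising?",
]


def choose_action(observation: str, llm: Callable[[str], str],
                  questions: List[str] = QUESTIONS) -> str:
    """Ask each question in turn, then request a single game command."""
    notes: List[str] = []
    for question in questions:
        prompt = (
            f"Game text:\n{observation}\n\n"
            "Notes so far:\n" + "\n".join(notes) + "\n\n"
            f"Question: {question}\nAnswer:"
        )
        answer = llm(prompt).strip()
        notes.append(f"Q: {question}\nA: {answer}")

    final_prompt = (
        f"Game text:\n{observation}\n\n"
        "Your notes:\n" + "\n".join(notes) + "\n\n"
        "Reply with one short game command and nothing else.\nCommand:"
    )
    lines = llm(final_prompt).strip().splitlines()
    # Keep only the first line; fall back to "look" (an assumption) if empty.
    return lines[0].strip() if lines else "look"
```

Keeping the model behind a plain callable makes it easy to swap a regular model for a reasoning model and compare them on the same prompts.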
Fine-tuning
- LoRA, Unsloth (more control, lower resource usage), Axolotl (potentially better for beginners), torchtune
- Fine-tuning based on game walkthroughs (Jericho provides them)
- Fine-tuning based on dataset from Q*BERT for question-answers (qa-jericho)
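One possible way to turn a walkthrough into fine-tuning data is sketched below. It assumes we have already replayed the walkthrough and collected `(observation, action)` pairs. The prompt/completion record layout, the `> ` command marker, and the `history_len` parameter are assumptions, not a format that any of the tools above requires.

```python
import json
from typing import Dict, Iterable, List, Tuple


def build_examples(steps: Iterable[Tuple[str, str]],
                   history_len: int = 2) -> List[Dict[str, str]]:
    """Build one prompt/completion pair per walkthrough step.

    Each prompt holds the last `history_len` turns plus the current
    observation; the completion is the walkthrough's action.
    """
    examples: List[Dict[str, str]] = []
    history: List[str] = []
    for observation, action in steps:
        obs = observation.strip()
        context = "\n".join(history[-history_len:])
        prompt = (context + "\n" if context else "") + f"{obs}\n> "
        examples.append({"prompt": prompt, "completion": action})
        history.append(f"{obs}\n> {action}")
    return examples


def to_jsonl(examples: List[Dict[str, str]]) -> str:
    """Serialize the examples as one JSON object per line."""
    return "\n".join(json.dumps(example) for example in examples)
```

Including a short history in each prompt gives the model some context beyond the current room, at the cost of longer training sequences.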
Q*BERT testing
Reinforcement Learning
- Feed the input as word embeddings? How do we define the action space?
- Stable Baselines (try it)
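One possible answer to the action-space question, sketched below: treat the list of candidate commands available at each step as the action space and score every (observation, action) pair. This is a hand-rolled linear Q-learner over hashed bag-of-words features, not Stable Baselines. The feature size, learning rate, and class name are assumptions chosen for illustration.

```python
import math
import random
import zlib
from typing import List

DIM = 256  # hypothetical feature size


def features(observation: str, action: str, dim: int = DIM) -> List[float]:
    """Hash observation and action tokens into one unit-length vector."""
    vec = [0.0] * dim
    for tok in observation.lower().split():
        vec[zlib.crc32(("o:" + tok).encode()) % dim] += 1.0
    for tok in action.lower().split():
        vec[zlib.crc32(("a:" + tok).encode()) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class LinearQ:
    """Q(s, a) = w . features(s, a), trained with one-step Q-learning."""

    def __init__(self, dim: int = DIM, lr: float = 0.1, gamma: float = 0.9):
        self.w = [0.0] * dim
        self.lr = lr
        self.gamma = gamma

    def q(self, obs: str, action: str) -> float:
        return sum(w * x for w, x in zip(self.w, features(obs, action)))

    def act(self, obs: str, actions: List[str], epsilon: float = 0.1) -> str:
        # Epsilon-greedy over whatever candidate actions this step offers.
        if random.random() < epsilon:
            return random.choice(actions)
        return max(actions, key=lambda a: self.q(obs, a))

    def update(self, obs: str, action: str, reward: float,
               next_obs: str, next_actions: List[str], done: bool) -> None:
        target = reward
        if not done and next_actions:
            target += self.gamma * max(self.q(next_obs, a) for a in next_actions)
        error = target - self.q(obs, action)
        for i, x in enumerate(features(obs, action)):
            if x:
                self.w[i] += self.lr * error * x
```

Because the candidate list changes from step to step, this design avoids committing to a fixed discrete action space, which is the part that is awkward to express in a library built around fixed-size spaces.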
RAG
1. Get room prompt
2. Turn into embedding vector
3. Use vector to access vector DB for relevant info we've learned
4. Build out our action prompt by taking room prompt, any relevant info from DB, and whatever final prompt we want to give the LLM
5. Semantically split room prompt and put into database
6. Send prompt to LLM and take action
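The six steps above can be sketched end to end with a toy setup. The hashed bag-of-words `embed` function stands in for a real embedding model, the in-memory `Memory` class stands in for a vector database, and sentence splitting stands in for semantic splitting. All of these names and choices are assumptions for illustration.

```python
import math
import re
import zlib
from typing import Callable, List, Tuple


def embed(text: str, dim: int = 256) -> List[float]:
    """Toy embedding: hashed bag of words, normalized to unit length."""
    vec = [0.0] * dim
    for tok in re.findall(r"[a-z']+", text.lower()):
        vec[zlib.crc32(tok.encode()) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class Memory:
    """In-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self.items: List[Tuple[List[float], str]] = []

    def add(self, text: str) -> None:
        self.items.append((embed(text), text))

    def search(self, query: str, k: int = 3) -> List[str]:
        q = embed(query)
        ranked = sorted(
            self.items,
            key=lambda item: -sum(a * b for a, b in zip(q, item[0])),
        )
        return [text for _, text in ranked[:k]]


def rag_step(room_text: str, memory: Memory,
             llm: Callable[[str], str], k: int = 3) -> str:
    relevant = memory.search(room_text, k)              # steps 2-3
    prompt = (                                           # step 4
        "Things learned so far:\n" + "\n".join(relevant) + "\n\n"
        f"Current room:\n{room_text}\n\n"
        "Reply with one game command.\nCommand:"
    )
    # Step 5: store the room text, split by sentence as a rough proxy
    # for semantic splitting.
    for sentence in re.split(r"(?<=[.!?])\s+", room_text.strip()):
        if sentence:
            memory.add(sentence)
    return llm(prompt).strip()                           # step 6
```

Retrieval runs before the room text is stored, so a room never retrieves its own sentences as "relevant info".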
- Vector databases
  - Redis
  - Postgres with pgvector
  - SQLite with sqlite-vec <- Tyson's favorite idea
  - Lots of specialized options, like FAISS, Qdrant, Chroma
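As a baseline for the SQLite option, the sketch below stores embeddings as BLOBs using only the standard-library `sqlite3` module and ranks them with a brute-force cosine scan in Python. It deliberately does not use sqlite-vec, which would move the nearest-neighbor search into SQL. The table layout and function names are assumptions.

```python
import math
import sqlite3
import struct
from typing import List, Sequence, Tuple


def pack(vec: Sequence[float]) -> bytes:
    """Encode a vector as packed 32-bit floats."""
    return struct.pack(f"{len(vec)}f", *vec)


def unpack(blob: bytes) -> List[float]:
    return list(struct.unpack(f"{len(blob) // 4}f", blob))


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS notes ("
        "id INTEGER PRIMARY KEY, text TEXT NOT NULL, embedding BLOB NOT NULL)"
    )
    return db


def add_note(db: sqlite3.Connection, text: str, vec: Sequence[float]) -> None:
    db.execute("INSERT INTO notes (text, embedding) VALUES (?, ?)",
               (text, pack(vec)))
    db.commit()


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(db: sqlite3.Connection, query: Sequence[float],
            k: int = 3) -> List[Tuple[float, str]]:
    """Brute-force scan: fine for small stores, slow for large ones."""
    rows = db.execute("SELECT text, embedding FROM notes").fetchall()
    scored = [(cosine(query, unpack(blob)), text) for text, blob in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

A single game session probably produces few enough notes that a full scan is acceptable, so this could also serve as a correctness check when trying the other databases on the list.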

Repository files
- adventure
- images
- outputs
- saved_outputs
- .env.example
- .gitignore
- README.md
- basic_llm_with_memory.out
- chatgpt_player.ipynb
- cs542_report.ipynb
- grpo-500-actions-result.log
- grpo-500-actions.log
- grpo-loss.log
- jericho-gcc-fixes.patch
- jericho_walkthrough.ipynb
- lora-l32-500-actions.log
- lora-l32-500.log
- lora-loss-actions.log
- ollama_adventure.py
- package.sh
- project_lewark_oleary.zip
- pyenv
- rag_ideas.md
- rag_requirements.txt
- requirements.txt
- setup-env.sh
- test_llms.ipynb
- test_prompt_completion.ipynb
- test_rag.ipynb
- unsloth_example.ipynb
- unsloth_requirements.txt
- use-env.sh