A step-by-step guide to building Retrieval-Augmented Generation (RAG) applications from scratch. This course combines theory + hands-on notebooks to help you understand, implement, and optimize RAG pipelines.
- An IDE for development, with Python set up (dependencies listed in `requirements.txt`)
- PyTorch with CUDA support (I am running CUDA 12.6)
- Python 3.12
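If you want a quick sanity check of your environment before opening the notebooks, something like the sketch below works (it only assumes PyTorch is already installed from `requirements.txt`):

```python
# Quick sanity check before running the notebooks.
import sys

import torch

# The notebooks assume Python 3.12; compare against your interpreter.
print(f"Python: {sys.version.split()[0]}")

# CUDA is optional but strongly recommended for the embedding steps.
if torch.cuda.is_available():
    print(f"CUDA available: {torch.version.cuda} on {torch.cuda.get_device_name(0)}")
else:
    print("CUDA not available - the notebooks will fall back to CPU (slower).")
```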
- What is RAG?
- Why do we need it?
- Real-world applications
- Collecting & cleaning documents
- Chunking strategies
- Preprocessing text
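To make the data-preparation steps concrete, here is a minimal sketch of cleaning raw text and splitting it into overlapping fixed-size chunks. The file names, chunk size, and overlap are illustrative assumptions, not values prescribed by the course:

```python
from pathlib import Path
import re


def clean_text(raw: str) -> str:
    """Collapse runs of whitespace and strip leading/trailing space."""
    return re.sub(r"\s+", " ", raw).strip()


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping fixed-size character chunks."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start : start + chunk_size])
        start += chunk_size - overlap
    return chunks


# Hypothetical input files; swap in your own collected documents.
docs = [clean_text(Path(p).read_text(encoding="utf-8")) for p in ["docs/a.txt", "docs/b.txt"]]
chunks = [c for doc in docs for c in chunk_text(doc)]
```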
- What are embeddings?
- Popular embedding models
- Generating and storing embeddings
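A minimal sketch of generating embeddings for those chunks, assuming the `sentence-transformers` package and the `all-MiniLM-L6-v2` model (one common choice; the notebooks may use a different model). It builds on the `chunks` list from the previous sketch:

```python
from sentence_transformers import SentenceTransformer

# all-MiniLM-L6-v2 is a small, widely used embedding model (384-dimensional vectors).
model = SentenceTransformer("all-MiniLM-L6-v2")

# `chunks` comes from the chunking sketch above; the result is a (num_chunks, 384) array.
embeddings = model.encode(chunks, show_progress_bar=True, normalize_embeddings=True)
```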
- Introduction to vector stores
- FAISS, Chroma, Pinecone, Weaviate
- Indexing and similarity search
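A sketch of indexing the embeddings with FAISS and running a similarity search. `IndexFlatIP` over normalized vectors amounts to cosine similarity, which is a reasonable default but not the only option:

```python
import faiss
import numpy as np

# Build a flat inner-product index; with normalized embeddings this is cosine similarity.
dim = embeddings.shape[1]
index = faiss.IndexFlatIP(dim)
index.add(np.asarray(embeddings, dtype="float32"))

# Embed the query the same way and retrieve the top-k most similar chunks.
query_vec = model.encode(["How does chunk overlap affect retrieval?"], normalize_embeddings=True)
scores, ids = index.search(np.asarray(query_vec, dtype="float32"), 3)
top_chunks = [chunks[i] for i in ids[0]]
```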
- Dense retrieval
- Sparse retrieval (BM25)
- Hybrid retrieval
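A sketch of hybrid retrieval that blends the dense FAISS scores with sparse BM25 scores, assuming the `rank_bm25` package and a simple weighted-sum fusion (just one of several ways to combine the two signals):

```python
from rank_bm25 import BM25Okapi

# Sparse index: BM25 over whitespace-tokenized chunks.
bm25 = BM25Okapi([c.lower().split() for c in chunks])


def hybrid_search(query: str, k: int = 3, alpha: float = 0.5) -> list[str]:
    """Blend dense and sparse scores with a weighted sum (alpha = dense weight)."""
    # Dense scores for every chunk via the FAISS index built above.
    q_vec = model.encode([query], normalize_embeddings=True).astype("float32")
    dense_scores, dense_ids = index.search(q_vec, len(chunks))
    dense = {int(i): float(s) for i, s in zip(dense_ids[0], dense_scores[0])}

    # Sparse BM25 scores, normalized to [0, 1] so the two scales are roughly comparable.
    sparse = bm25.get_scores(query.lower().split())
    sparse = sparse / (sparse.max() or 1.0)

    combined = {i: alpha * dense.get(i, 0.0) + (1 - alpha) * float(sparse[i]) for i in range(len(chunks))}
    best = sorted(combined, key=combined.get, reverse=True)[:k]
    return [chunks[i] for i in best]
```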
- User → Retriever → LLM → Answer flow
- Basic pipeline implementation
- First working RAG demo
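A minimal sketch of the User → Retriever → LLM → Answer flow. It assumes a locally served LLM behind an OpenAI-compatible endpoint (for example Ollama at `http://localhost:11434/v1`); the model name and prompt wording are illustrative, not the course's:

```python
from openai import OpenAI

# Assumption: a local model served through an OpenAI-compatible endpoint.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="not-needed")


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Stuff the retrieved chunks into the prompt ahead of the question."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


def answer(question: str, k: int = 3) -> str:
    """User -> Retriever -> LLM -> Answer in one call, using the retriever sketched above."""
    prompt = build_prompt(question, hybrid_search(question, k=k))
    response = client.chat.completions.create(
        model="llama3.1",  # whichever model you have pulled locally
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```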
- Prompt engineering
- Reranking retrieved results
- Handling long contexts
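A sketch of reranking retrieved chunks with a cross-encoder and trimming the result to a rough context budget. The `cross-encoder/ms-marco-MiniLM-L-6-v2` model and the character budget are illustrative assumptions:

```python
from sentence_transformers import CrossEncoder

# A small cross-encoder scores (query, chunk) pairs jointly, usually more accurately
# than the bi-encoder similarity used for the initial retrieval.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")


def rerank(query: str, candidates: list[str], top_n: int = 3, char_budget: int = 4000) -> list[str]:
    """Re-order candidates by cross-encoder score, then trim to a rough context budget."""
    scores = reranker.predict([(query, c) for c in candidates])
    ranked = [c for _, c in sorted(zip(scores, candidates), reverse=True)]

    kept, used = [], 0
    for chunk in ranked[:top_n]:
        if used + len(chunk) > char_budget:
            break
        kept.append(chunk)
        used += len(chunk)
    return kept
```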
- End-to-end simple Q&A RAG app
- Run locally with Gradio
- Use LLMs running locally
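A sketch of wrapping the pipeline in a small Gradio UI for local use; it reuses the `answer()` function from the pipeline sketch above, which already talks to a locally served model:

```python
import gradio as gr

# Minimal Gradio wrapper around the answer() pipeline sketched earlier.
demo = gr.Interface(
    fn=answer,
    inputs=gr.Textbox(label="Ask a question about your documents"),
    outputs=gr.Textbox(label="Answer"),
    title="Simple RAG Q&A (learning demo)",
)

if __name__ == "__main__":
    demo.launch()  # serves the app locally, by default at http://127.0.0.1:7860
```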
- Each chapter has a notebook with explanations + code
- Follow in order for a progressive learning path
- Reusable Python scripts live in `common/`
Caveat: This chatbot is a learning project only and is not production-ready. The objective of this tutorial is to build intuition about RAG and to give learners enough hands-on experience to explore further on their own.