fuyuantan/rag-techniques


Learning Sequence:

  1. RAG_Reformulate.py

  2. RAG_Hybrid_Retrieval.py

  3. RAG_Re-ranking.py

  4. RAG-Part2/1-RAG-PDF-Split

  5. RAG-Part2/2-RAG-LLM

  6. RAG-Part2/3-RAG-Eval

Steps 1-3 are basic RAG techniques.
Steps 4-6 form an end-to-end pipeline: we chunk the PDF, pass the retrieved chunks to the LLM (qwen2-0.5) as context to generate the answer, and finally evaluate the RAG pipeline with metrics such as hit rate.
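
As a rough illustration of the hit rate metric mentioned above, here is a minimal sketch. It assumes the usual definition (the fraction of queries whose gold chunk appears among the top-k retrieved chunks); the function name `hit_rate`, the chunk ids, and the data are hypothetical and not taken from RAG-Part2/3-RAG-Eval.

```python
def hit_rate(retrieved_ids, relevant_ids, k=3):
    """Fraction of queries whose gold chunk id appears in the top-k retrieved ids.

    retrieved_ids: one ranked list of chunk ids per query (best first).
    relevant_ids:  one gold chunk id per query, in the same order.
    """
    if len(retrieved_ids) != len(relevant_ids):
        raise ValueError("need exactly one gold id per query")
    if not retrieved_ids:
        return 0.0
    hits = sum(
        1
        for ranked, gold in zip(retrieved_ids, relevant_ids)
        if gold in ranked[:k]
    )
    return hits / len(retrieved_ids)


# Hypothetical retrieval results for three queries.
retrieved = [["c1", "c4", "c7"], ["c2", "c9", "c3"], ["c5", "c6", "c8"]]
gold = ["c4", "c3", "c0"]

print(hit_rate(retrieved, gold, k=3))  # two of three queries hit
print(hit_rate(retrieved, gold, k=1))  # no gold chunk is ranked first
```

Reporting hit rate at several values of k (for example 1, 3, and 5) shows whether the retriever finds the right chunk at all or also ranks it near the top.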


1. Reformulate: Use a BART model to reformulate the query.

python RAG_Reformulate.py

2. Hybrid Retrieval: sparse (BM25) + dense (FAISS + Sentence Transformer embeddings)

pip install faiss-gpu
python RAG_Hybrid_Retrieval.py
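
The README does not say how the sparse and dense results are combined. One common option is to min-max normalize each score list and take a weighted sum, sketched below in plain Python. The names `min_max` and `hybrid_fuse`, the weight `alpha`, and the scores are illustrative assumptions rather than the contents of RAG_Hybrid_Retrieval.py, and the sketch assumes higher scores are better for both retrievers.

```python
def min_max(scores):
    """Scale a {doc_id: score} dict into [0, 1]; all-equal scores map to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_fuse(sparse_scores, dense_scores, alpha=0.5):
    """Weighted sum of normalized sparse and dense scores.

    alpha weights the sparse (e.g. BM25) side, 1 - alpha the dense side.
    A document missing from one retriever contributes 0.0 for that side.
    Returns (doc_id, fused_score) pairs, best first, ties broken by doc_id.
    """
    sparse_n = min_max(sparse_scores)
    dense_n = min_max(dense_scores)
    docs = set(sparse_n) | set(dense_n)
    fused = {
        d: alpha * sparse_n.get(d, 0.0) + (1 - alpha) * dense_n.get(d, 0.0)
        for d in docs
    }
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up scores standing in for BM25 and embedding-similarity output.
sparse = {"d1": 2.0, "d2": 6.0, "d3": 4.0}
dense = {"d1": 0.9, "d3": 0.7, "d4": 0.1}

ranking = hybrid_fuse(sparse, dense, alpha=0.5)
print([doc for doc, _ in ranking])  # ['d3', 'd1', 'd2', 'd4']
```

Here `d3` wins because it scores reasonably well on both sides, while `d1` and `d2` are each strong on only one. If the dense index returns distances (lower is better) rather than similarities, they would need to be negated or inverted before fusing.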

3. Re-ranking: Use a cross-encoder (ms-marco-MiniLM-L-6-v2) to score each (query, initial retrieval result) pair, then re-rank the results by those scores.

python RAG_Re-ranking.py
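
The re-ranking step can be sketched independently of the model: build (query, document) pairs, score them, and sort by score. In the sketch below, `rerank` and the toy word-overlap scorer `overlap_score` are hypothetical names used for illustration, so that the example runs without downloading a model; they are not taken from RAG_Re-ranking.py.

```python
def rerank(query, candidates, score_fn, top_k=None):
    """Re-order candidates by the score of each (query, candidate) pair.

    score_fn takes a list of (query, document) pairs and returns one
    score per pair, higher meaning more relevant.
    """
    pairs = [(query, doc) for doc in candidates]
    scores = [float(s) for s in score_fn(pairs)]
    ranked = sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


def overlap_score(pairs):
    """Toy stand-in scorer: number of lowercase words shared by query and document."""
    return [len(set(q.lower().split()) & set(d.lower().split())) for q, d in pairs]


# Pretend these came back from the initial retrieval step.
initial_results = [
    "faiss is a vector index",
    "bm25 is a sparse ranking function",
    "cats sleep a lot",
]

reranked = rerank("what is bm25", initial_results, overlap_score)
print(reranked[0][0])  # bm25 is a sparse ranking function
```

To use the model named above, `sentence_transformers.CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2").predict` should be usable as `score_fn`, since it accepts a list of (query, passage) pairs and returns one score per pair.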

About

RAG techniques, including query reformulation, hybrid retrieval, and re-ranking.
