HyperRAG: Reasoning N-ary Facts over Hypergraphs for Retrieval Augmented Generation
Accepted at The Web Conference (WWW) 2026
HyperRAG addresses a fundamental limitation of conventional RAG systems: the inability to capture N-ary facts — relationships that involve more than two entities simultaneously (e.g., "Person A received Award B from Organization C in Year D").
Instead of decomposing such facts into binary edges (losing relational context), HyperRAG encodes them as hyperedges in a hypergraph, where a single edge can connect any number of nodes. This structure enables:
- Faithful representation of complex, multi-entity facts without information loss
- Structured multi-hop reasoning by traversing hyperedges across the graph
- Precise retrieval via a trained MLP-based retriever (HyperRetriever) that scores candidate hyperedges given a query
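To make the contrast with binary decomposition concrete, here is a minimal sketch of the example fact stored as a single hyperedge. The `Hyperedge` class, the `received_award` label, and `binary_decomposition` are illustrative names, not taken from the HyperRAG codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperedge:
    """One n-ary fact: a relation label plus every participating entity."""
    relation: str
    nodes: frozenset


def binary_decomposition(edge):
    """Split a hyperedge into unordered pairs (the lossy alternative)."""
    ordered = sorted(edge.nodes)
    return [(a, b) for i, a in enumerate(ordered) for b in ordered[i + 1:]]


# Hypothetical encoding of the example fact from the introduction.
fact = Hyperedge(
    relation="received_award",
    nodes=frozenset({"Person A", "Award B", "Organization C", "Year D"}),
)
pairs = binary_decomposition(fact)
print(len(fact.nodes), "entities in one hyperedge vs", len(pairs), "binary edges")
```

Once the fact is split into pairs, nothing records that the pairs came from the same event, which is the relational context a hyperedge keeps.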
The system consists of two core modules:
| Module | Role | Key Mechanism |
|---|---|---|
| HyperMemory | Memory-Guided Beam Retriever | Leverages the LLM’s parametric memory to guide beam search over n-ary facts without extra training. |
| HyperRetriever | Learnable Relational Retriever | Uses a trained MLP to fuse structural and semantic signals for adaptive, query-aware chain extraction. |
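As a rough illustration of beam search over n-ary facts, the sketch below chains hyperedges that share at least one node and keeps the highest-scoring chains at each hop. It is not the HyperMemory implementation: the `score` callable is a stand-in for LLM-guided scoring (here a simple overlap count), and all names and toy data are assumptions.

```python
def beam_search(hyperedges, start, score, beam_width=2, max_hops=2):
    """Grow chains of hyperedges, keeping the `beam_width` best per hop."""
    # Each beam is (total score, chain of hyperedge indices, frontier nodes).
    beams = [(0.0, [], frozenset([start]))]
    for _ in range(max_hops):
        candidates = []
        for total, chain, frontier in beams:
            for idx, nodes in enumerate(hyperedges):
                # Skip edges already in the chain or not touching the frontier.
                if idx in chain or not (nodes & frontier):
                    continue
                candidates.append((total + score(nodes), chain + [idx], nodes))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    return beams


# Toy hypergraph: each hyperedge is the set of entities in one fact.
hyperedges = [
    frozenset({"Person A", "Award B", "Organization C", "Year D"}),
    frozenset({"Organization C", "City E"}),
    frozenset({"Person A", "University F"}),
]
query_terms = {"Organization C", "City E"}


def overlap_score(nodes):
    """Placeholder scorer: number of query terms in the hyperedge."""
    return len(nodes & query_terms)


best = beam_search(hyperedges, "Person A", overlap_score)
print(best[0][1])  # chain of hyperedge indices for the top beam
```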
The codebase builds on HyperGraphRAG.
- Installation
- Configuration
- Datasets
- Project Structure
- Pipeline: WikiTopics (Closed Domain)
- Pipeline: Open Domain
- Evaluation
- License
- Citation
conda create -n hyperrag python=3.11.13
conda activate hyperrag
pip install -r requirements.txt
Create a config.json file in the project root with your API credentials:
{
"openai_api_key": "YOUR_OPENAI_API_KEY"
}
WikiTopics is a closed-domain multi-hop QA dataset organized into 11 topic domains:
| Domain | Key | Domain | Key |
|---|---|---|---|
| Art | art | Infrastructure | infra |
| Award | award | Location | loc |
| Education | edu | Organization | org |
| Health | health | People | people |
| Science | sci | Sport | sport |
| Taxonomy | tax | | |
Each domain provides both a Knowledge Graph (KG) version and a Natural Language (NLG) version. The main method uses the NLG version; the KG version is used for ablation studies.
| Version | Download |
|---|---|
| Full Dataset — KG | 🔗 WikiTopics KG |
| Full Dataset — NLG | 🔗 WikiTopics NLG |
| Sampled Dataset (1%) | dataset/wikitopics_test_sampled |
After downloading the full WikiTopics dataset, place it in the dataset/ folder:
dataset/
├── open_domain_dataset/
├── open_domain_splitted/
├── wikitopics_test_sampled/
└── WikiTopicsQE_NLG/ <-- place the full WikiTopics dataset here
| Split | Path |
|---|---|
| Full dataset | dataset/open_domain_dataset |
| Pre-split for training/testing | dataset/open_domain_splitted |
Open domain includes: 2WikiMultiHopQA, HotpotQA, and MuSiQue.
.
├── dataset/
│ ├── open_domain_dataset/
│ ├── open_domain_splitted/
│ ├── wikitopics_test_sampled/
│ └── WikiTopicsQE_NLG/
├── evaluate/
│ ├── qa_eval_EM_F1.py # Open domain evaluation
│ └── qa_eval_MRR_HIT.py # Closed domain evaluation
├── HyperMemory/ # Graph construction + memory-based QA (WikiTopics)
├── HyperMemory_open/ # Memory-based QA (open domain)
├── HyperMemory_token/ # Token efficiency variant
├── HyperRetriever/ # MLP retriever QA (WikiTopics)
├── HyperRetriever_open/ # MLP retriever QA (open domain)
├── HyperRetriever_token/ # Token efficiency variant
├── HyperRetriever_token_kg/ # Token efficiency variant (KG input)
└── results/ # Auto-generated inference outputs
Replace {DOMAIN} with one of the 11 domain keys (e.g., art, award, edu, ...).
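A small helper can catch a mistyped domain key before a long run starts. The mapping below copies the 11 domains from the WikiTopics table; the `resolve_domain` function is a hypothetical convenience and is not part of the repository.

```python
# Display name -> key, as listed in the WikiTopics domain table.
DOMAIN_KEYS = {
    "Art": "art", "Award": "award", "Education": "edu", "Health": "health",
    "Infrastructure": "infra", "Location": "loc", "Organization": "org",
    "People": "people", "Science": "sci", "Sport": "sport", "Taxonomy": "tax",
}


def resolve_domain(value):
    """Accept either a key ('edu') or a display name ('Education')."""
    if value in DOMAIN_KEYS.values():
        return value
    if value in DOMAIN_KEYS:
        return DOMAIN_KEYS[value]
    valid = ", ".join(sorted(DOMAIN_KEYS.values()))
    raise ValueError(f"Unknown domain {value!r}; expected one of: {valid}")


print(resolve_domain("Education"))
```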
This step constructs the hypergraph from the WikiTopics NLG corpus. The hypergraph encodes N-ary facts as hyperedges and is shared by both HyperMemory and HyperRetriever. Outputs are written to an expr/ directory in the project root.
cd HyperMemory
python wikitopics_construct.py {DOMAIN}
Output:
expr/{DOMAIN}/ contains the hypergraph structure, node/edge embeddings, and associated index files used in all downstream steps.
Run question answering directly over the constructed hypergraph using the memory-based approach. No additional training is required.
# From HyperMemory/
python wikitopics_query.py {DOMAIN}
Output:
results/HyperMemory/{DOMAIN}_output.jsonl
HyperRetriever improves retrieval precision by training a lightweight MLP on top of hypergraph embeddings. Complete both sub-steps before running inference.
cd HyperRetriever
python retrieve/prepare.py {DOMAIN}
python retrieve/train.py {DOMAIN}
Output: A trained MLP checkpoint saved under expr/ for the specified domain.
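For intuition about what an MLP-based retriever does at inference time, the following sketch scores candidate hyperedges with a one-hidden-layer network over concatenated query and hyperedge vectors. Everything here is an assumption made for illustration: the feature layout, the layer sizes, and the random (untrained) weights do not reflect the checkpoint that `retrieve/train.py` produces.

```python
import math
import random


def mlp_score(features, w1, b1, w2, b2):
    """One hidden ReLU layer followed by a sigmoid output in (0, 1)."""
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, features)) + bias)
        for row, bias in zip(w1, b1)
    ]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))


def rank_candidates(query_vec, candidates, params):
    """Score each candidate on [query ; hyperedge] features, best first."""
    scored = [(mlp_score(query_vec + vec, *params), name) for name, vec in candidates]
    return sorted(scored, reverse=True)


# Untrained, randomly initialised weights, used only to make the sketch run.
random.seed(0)
dim, hidden_units = 3, 4
params = (
    [[random.uniform(-1, 1) for _ in range(2 * dim)] for _ in range(hidden_units)],
    [0.0] * hidden_units,
    [random.uniform(-1, 1) for _ in range(hidden_units)],
    0.0,
)

query = [0.2, 0.7, 0.1]  # hypothetical query embedding
candidates = [("edge_0", [0.1, 0.8, 0.1]), ("edge_1", [0.9, 0.0, 0.1])]
ranking = rank_candidates(query, candidates, params)
print(ranking)
```

Training would adjust the weights so that hyperedges relevant to the query receive higher scores; the ranking step itself stays the same.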
Run question answering using the trained retriever.
# From HyperRetriever/
python wikitopics_query.py {DOMAIN}
Output:
results/HyperRetriever/{DOMAIN}_output.jsonl
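The WikiTopics steps above run in a fixed order, so they can be collected into one list per domain. This sketch only builds and prints the commands; the `pipeline_commands` helper is hypothetical, and the script names and working directories are the ones given in this README.

```python
def pipeline_commands(domain):
    """Return (working directory, argv) pairs in the documented order."""
    return [
        ("HyperMemory", ["python", "wikitopics_construct.py", domain]),
        ("HyperMemory", ["python", "wikitopics_query.py", domain]),
        ("HyperRetriever", ["python", "retrieve/prepare.py", domain]),
        ("HyperRetriever", ["python", "retrieve/train.py", domain]),
        ("HyperRetriever", ["python", "wikitopics_query.py", domain]),
    ]


for cwd, argv in pipeline_commands("art"):
    print(f"(cd {cwd} && {' '.join(argv)})")
```

To execute the steps rather than print them, each pair could be passed to `subprocess.run(argv, cwd=cwd, check=True)`, which stops at the first failing step.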
The open domain pipeline follows the same logic as WikiTopics. Use the modules ending with _open.
All inference scripts automatically write results to results/{MODULE}/, with filenames ending in _output.jsonl.
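Because the outputs follow a predictable naming pattern, they are easy to collect for inspection. The sketch below lists and parses the files for one module; the record schema is not documented here, so the `id` field in the demo data is purely hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_jsonl(path):
    """Parse one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def output_files(module, root="results"):
    """List files matching {root}/{module}/*_output.jsonl, sorted by name."""
    return sorted(Path(root, module).glob("*_output.jsonl"))


# Demo on a temporary directory so the sketch runs without real results.
with tempfile.TemporaryDirectory() as tmp:
    module_dir = Path(tmp, "HyperMemory")
    module_dir.mkdir()
    (module_dir / "art_output.jsonl").write_text(
        '{"id": 1}\n{"id": 2}\n', encoding="utf-8"
    )
    files = output_files("HyperMemory", root=tmp)
    records = load_jsonl(files[0])

print(files[0].name, len(records))
```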
Use MRR (Mean Reciprocal Rank) and Hit Rate to evaluate answer ranking quality:
python evaluate/qa_eval_MRR_HIT.py --model {OUTPUT_FOLDER} {DATASET}
Example:
python evaluate/qa_eval_MRR_HIT.py --model HyperMemory art
Use Exact Match (EM) and F1 Score to evaluate answer extraction quality:
python evaluate/qa_eval_EM_F1.py --model_name {OUTPUT_FOLDER} {DATASET}
| Dataset Type | Metric | Script |
|---|---|---|
| Closed domain (WikiTopics) | MRR, Hit Rate | qa_eval_MRR_HIT.py |
| Open domain (2Wiki, HotpotQA, MuSiQue) | Exact Match, F1 | qa_eval_EM_F1.py |
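For readers unfamiliar with these metrics, the sketch below shows their textbook definitions. It is not the code in `qa_eval_MRR_HIT.py` or `qa_eval_EM_F1.py`; in particular, the `normalize` step (lowercasing and dropping punctuation) is a simplifying assumption, and the repository scripts may normalize answers differently.

```python
import re
from collections import Counter


def reciprocal_rank(ranked, gold):
    """1 / rank of the first correct item, or 0 if none is correct."""
    for position, item in enumerate(ranked, start=1):
        if item in gold:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(rankings, golds):
    """MRR: reciprocal rank averaged over all questions."""
    return sum(reciprocal_rank(r, g) for r, g in zip(rankings, golds)) / len(rankings)


def hit_at_k(ranked, gold, k):
    """1.0 if any of the top-k items is correct, else 0.0."""
    return float(any(item in gold for item in ranked[:k]))


def normalize(text):
    """Lowercase, replace punctuation with spaces, split into tokens."""
    return re.sub(r"[^a-z0-9 ]", " ", text.lower()).split()


def exact_match(prediction, answer):
    return float(normalize(prediction) == normalize(answer))


def token_f1(prediction, answer):
    """Harmonic mean of token-level precision and recall."""
    pred, gold = normalize(prediction), normalize(answer)
    overlap = sum((Counter(pred) & Counter(gold)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


print(mean_reciprocal_rank([["x", "y", "z"], ["a", "b"]], [{"y"}, {"a"}]))
print(token_f1("the city of Paris", "Paris"))
```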
This project is licensed under the MIT License — see the LICENSE file for details.
If you use HyperRAG in your research, please cite:
@inproceedings{lien2026hyperrag,
title={HyperRAG: Reasoning N-ary Facts over Hypergraphs for Retrieval Augmented Generation},
author={Wen-Sheng Lien and Yu-Kai Chan and Hao-Lung Hsiao and Bo-Kai Ruan and Meng-Fen Chiang and Chien-An Chen and Yi-Ren Yeh and Hong-Han Shuai},
booktitle={The Web Conference (WWW)},
year={2026}
}
- arXiv Paper link
- Installation Instructions
- Integrate token counter function into modules