VernonOY/replicalpha

replicalpha

English · 简体中文

Research-reproduction Agent: PDF → factor code → backtest → Red Team Validator → reproducibility score.

replicalpha takes a quant research PDF and produces a complete reproduction bundle in one command — structured paper metadata, executable Python factor code, backtest results, automated Red Team findings, and a plain-English reproducibility score.

The product

A personal timeline for every quant paper you've ever read, with one verdict per paper: does the factor still work?

📈 Timeline · every paper, every verdict, in one place

Multi-lane chronological view, color-coded by reproducibility verdict. Lanes by factor family (momentum / reversal / value / quality / volatility / liquidity / other). Filter by verdict, universe, or family. Hover for paper details.

🎯 Verdict · the killer page

For each paper: a one-sentence natural-language verdict, a score gauge, a side-by-side of paper claim vs reproduction, and a cumulative-return chart. One-click "Open in IDE" to inspect or fix the generated factor code. Up/down arrow keys scrub through your timeline.

💼 Portfolio · paper-trade your reproduced factors

Portfolio

Daily-mark equity curve with CSI 300 benchmark overlay, drawdown panel, allocations table. Strategy +27% vs benchmark −0.14% on the simulated 252-session window above. Export rebalance orders to IBKR / Tiger / Futu CSV.

🔭 Monitor · live IC decay alerts

Watch any reproduced factor live. Alerts fire when the 30-day IC moving average crosses your threshold (decay, sign-flip, sample concentration). The sparkline shows raw IC plus a 7-point moving average against a zero baseline.
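
As a rough illustration of the threshold-crossing alert described above, here is a minimal sketch in plain Python. The function names, the 3-point window used in the toy data, and the "crossed from at-or-above to below" rule are assumptions made for illustration; the Monitor's actual alert logic is not documented here.

```python
from statistics import fmean


def moving_average(values, window):
    """Trailing moving average; positions without a full window are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(fmean(values[i + 1 - window : i + 1]))
    return out


def crossed_below(ma, threshold):
    """Indices where the average moves from >= threshold to < threshold."""
    hits = []
    for i in range(1, len(ma)):
        prev, cur = ma[i - 1], ma[i]
        if prev is not None and cur is not None and prev >= threshold > cur:
            hits.append(i)
    return hits


# Toy daily IC series (hypothetical numbers) with a short window for readability.
daily_ic = [0.04, 0.05, 0.03, 0.01, -0.02, -0.03]
ma = moving_average(daily_ic, window=3)
alerts = crossed_below(ma, threshold=0.02)
print(alerts)
```

With a real series, `window=30` would correspond to the 30-day average mentioned above.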

🤖 Agent · find papers from arxiv / SSRN / Scholar

Don't have a PDF? Describe a research direction in plain English: "Latest momentum factors in CSI 500 (2024)" / "Post-publication decay of value anomalies". Agent ranks candidates and lets you pick which to reproduce.

📚 Library · your archive, queryable

26 papers in this snapshot. Sort by claimed IC, reproduced IC, score, family, year. Filter, multi-select, batch-rerun, export.

🛠️ IDE workspace · Cursor for quant research

Every reproduction is a full editable workspace: PDF + Claude-generated factor.py / backtest.py / redteam.py, terminal, and an in-context agent that knows the verdict, the data, and the code. Quick-actions surface the right next move per verdict ("Why is the verdict not green?" / "Fix the sign-flip" / "Add 1/99 winsorization" / "Audit for look-ahead leakage").


Bilingual UI (EN / 中文) · dark mode default · keyboard-first (⌘K for everything).

Live demo — real research paper on real A-share data

Reproducing Zeng & Liu (2016) "Momentum and Reversal Effects on the Chinese Stock Market" on CSI 300 top-30 with Tushare Pro daily data, 2022-01 to 2024-12:

Headline finding: the paper's claimed reversal effect does not reproduce

| | Paper claim (2010-2016, 554 stocks) | Our reproduction (2022-2024, CSI 300 top 30) |
|---|---|---|
| 6-month reversal IC | +0.013 | −0.022 |
| Sign | positive reversal | sign-flipped → momentum |
| Reproducibility score | | 0.00 (weak; sign mismatch) |

The Red Team validator also fires 1 warning (sample concentration). This is exactly the kind of finding replicalpha is built to surface — a published A-share factor whose sign flipped on a later period and tighter universe.

Charts

Cumulative long-short return Rolling IC Drawdown

Full generated research report: docs/images/real-demo/report.md. The extracted (and manually polished) ResearchCard: docs/images/real-demo/research_card.json.

Reproducing the demo yourself

uv sync --extra demo                        # installs tushare + matplotlib
export TUSHARE_TOKEN=... OPENAI_API_KEY=...
uv run python scripts/generate_real_demo.py  # auto-downloads the paper PDF

First run pulls ~4 years × 30 tickers from Tushare (~1 min) and calls OpenAI once to extract the ResearchCard (~10s). Subsequent runs use on-disk caches: no API calls, and charts regenerate in seconds.

Quickstart on your own PDF + your own data

uv run replicalpha run your-paper.pdf --data your-csv.csv \
    --out ./out --start 2022-01-03 --end 2024-12-31

See the Quick start and Bring your own data sections below for the CLI and the DataAdapter Protocol.

Synthetic-data wiring test (no API key, fastest path)

For a 2-second smoke test on bundled synthetic data, see docs/images/demo-terminal.svg and docs/images/demo-report.md. This is what CI uses.

Pipeline

flowchart LR
    PDF[📄 Paper PDF] --> EXT[paper2alpha<br/>PyMuPDF + LLM]
    EXT --> RC[ResearchCard<br/>JSON schema]
    RC --> CG[DSL interpreter<br/>+ LLM fallback]
    CG --> QT[qtype lint<br/>look-ahead check]
    QT --> BT[pandas backtest<br/>quintile IC]
    BT --> RT[Red Team<br/>5 checks]
    RT --> SC[Reproducibility<br/>score]
    SC --> MD[📝 report.md]

    classDef in fill:#e3f2fd,stroke:#1976d2,color:#0d47a1
    classDef core fill:#fff3e0,stroke:#f57c00,color:#e65100
    classDef out fill:#e8f5e9,stroke:#388e3c,color:#1b5e20
    class PDF in
    class EXT,RC,CG,QT,BT,RT,SC core
    class MD out

Quick start

uv sync
export OPENAI_API_KEY=sk-...   # required for PDF → ResearchCard extraction
uv run replicalpha run path/to/paper.pdf \
    --out ./out \
    --data path/to/market_data.csv \
    --start 2022-01-03 --end 2024-12-31

Outputs under ./out/:

  • research_card.json — structured paper metadata (paper2alpha schema)
  • factor_<name>.py — executable Python factor (qtype-clean)
  • backtest.json — IC time series + aggregate stats
  • validator.json — 5-check Red Team findings
  • reproducibility.json — numeric score + interpretation
  • report.md — aggregated markdown research report

Offline demo (no API key needed)

# Generate a mock card JSON
cat > /tmp/card.json <<'JSON'
{
  "source": "demo",
  "factors": [{
    "name": "ma20", "chinese_name": "20日均线", "definition": "d",
    "formula": "rolling_mean(close, 20)", "data_fields": ["close"],
    "params": {"lookback": 20}, "universe": "全A",
    "reported_metrics": {"ic_mean": 0.045, "backtest_period": "2022~2024"}
  }]
}
JSON

uv run replicalpha run tests/cases/demo.pdf \
    --out ./demo-out \
    --data tests/cases/sample_market_data.csv \
    --start 2022-03-01 --end 2022-12-31 \
    --extractor-mock /tmp/card.json

Expected terminal output:

replicalpha run — demo.pdf

✓ run r-1a2b3c4d finished
  output: ./demo-out
  codegen (dsl) ✓
  backtest: IC 0.0123  cumret +2.45%  maxDD 4.21%  Sharpe 0.87
  reproducibility: 0.42  — weak reproduction: claimed IC = 0.0450,
                          reproduced = 0.0123 (72.7% deviation).
  red team findings: 2 warning, 1 critical
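
The 72.7% deviation in the sample output can be checked by hand. The sketch below assumes the deviation is the absolute gap relative to the claimed IC; it does not attempt to reproduce the 0.42 score, whose formula is not given here. The `same_sign` helper is a hypothetical illustration of the sign check behind the "sign mismatch" verdict in the live demo.

```python
def relative_deviation(claimed_ic, reproduced_ic):
    """|claimed - reproduced| / |claimed|; assumes claimed_ic is nonzero."""
    return abs(claimed_ic - reproduced_ic) / abs(claimed_ic)


def same_sign(claimed_ic, reproduced_ic):
    """True when both ICs point in the same direction."""
    return claimed_ic * reproduced_ic > 0


dev = relative_deviation(0.0450, 0.0123)
print(f"{dev:.1%}")  # 72.7%

# Live-demo numbers: claimed +0.013, reproduced -0.022.
print(same_sign(0.013, -0.022))  # False
```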

See examples/reproduce_demo.md for the full walkthrough.

Red Team checks (v0.1)

| # | Check | Fires when |
|---|---|---|
| 1 | overfitting_hint | lookback is a non-round value (suggests tuning) |
| 2 | small_cap_exposure | held-leg median market cap < 0.5× universe median |
| 3 | data_leakage | generated code fails qtype (look-ahead / future function) |
| 4 | sample_concentration | single year contributes > 50% of cumulative IC |
| 5 | factor_redundancy | paper declares multiple factors with near-identical formulas |
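
As an illustration of check 4, the following sketch flags a run when one calendar year accounts for more than half of the summed IC. The input shape (a dict of ISO date strings to IC values), the function name, and the definition of "contribution" as a year's IC sum divided by the total are assumptions; the validator's real implementation may differ.

```python
from collections import defaultdict


def sample_concentration(ic_by_date, limit=0.5):
    """Return (year, share) if one year exceeds `limit` of cumulative IC, else None."""
    per_year = defaultdict(float)
    for date, ic in ic_by_date.items():
        per_year[date[:4]] += ic  # dates are ISO strings, "YYYY-MM-DD"
    total = sum(per_year.values())
    if total == 0:
        return None  # share is undefined when the cumulative IC is zero
    year, contrib = max(per_year.items(), key=lambda kv: kv[1] / total)
    share = contrib / total
    return (year, share) if share > limit else None


# Hypothetical IC observations: 2023 dominates the total.
finding = sample_concentration({
    "2022-03-01": 0.01,
    "2022-06-01": 0.01,
    "2023-03-01": 0.05,
    "2024-03-01": 0.01,
})
print(finding)
```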

HTTP server

RUNS_ROOT=./runs REPLICALPHA_DATA_CSV=./market.csv \
uv run uvicorn replicalpha.server.main:app --port 8000

Endpoints:

  • POST /runs — upload PDF, start pipeline, returns run_id
  • GET /runs/{run_id} — structured PipelineReport
  • GET /runs/{run_id}/report — markdown report (text/markdown)
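
A small client for the two GET endpoints might look like the sketch below, using only the standard library. The base URL assumes the server was started locally on port 8000 as shown above, and the helper names are hypothetical. The POST upload is left out because the expected form field name is not documented in this README.

```python
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: local server on --port 8000


def run_url(run_id, base=BASE_URL):
    """URL of the structured PipelineReport for a run."""
    return f"{base}/runs/{run_id}"


def report_url(run_id, base=BASE_URL):
    """URL of the markdown report for a run."""
    return f"{run_url(run_id, base)}/report"


def fetch_report(run_id, base=BASE_URL, timeout=10.0):
    """GET the markdown report and return it as text (needs a running server)."""
    with urllib.request.urlopen(report_url(run_id, base), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


print(report_url("r-1a2b3c4d"))
```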

Bring your own data

replicalpha ships a 20-ticker × 500-day synthetic sample for CI and demos. For real results, implement the DataAdapter Protocol (src/replicalpha/core/data.py) against your data source (Wind / Tushare / yfinance / Bloomberg / …). The adapter's contract is 3 methods: get_price(field, start, end, universe), get_trading_days(start, end), get_metadata().
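
The three-method contract could be sketched as follows. Only the method names and parameter names come from the text above; the type hints, the nested-dict return shape, and the `InMemoryAdapter` class are assumptions for illustration (the real Protocol in src/replicalpha/core/data.py may well return pandas objects instead).

```python
from typing import Any, Protocol, Sequence


class DataAdapter(Protocol):
    # Return types below are assumed, not taken from the project source.
    def get_price(self, field: str, start: str, end: str,
                  universe: Sequence[str]) -> dict[str, dict[str, float]]: ...
    def get_trading_days(self, start: str, end: str) -> list[str]: ...
    def get_metadata(self) -> dict[str, Any]: ...


class InMemoryAdapter:
    """Toy adapter over (date, ticker, field, value) rows with ISO date strings."""

    def __init__(self, rows):
        self._rows = list(rows)

    def get_price(self, field, start, end, universe):
        out: dict[str, dict[str, float]] = {}
        for date, ticker, name, value in self._rows:
            if name == field and start <= date <= end and ticker in universe:
                out.setdefault(date, {})[ticker] = value
        return out

    def get_trading_days(self, start, end):
        return sorted({row[0] for row in self._rows if start <= row[0] <= end})

    def get_metadata(self):
        return {"source": "in-memory", "tickers": sorted({row[1] for row in self._rows})}


rows = [
    ("2022-01-03", "A", "close", 10.0),
    ("2022-01-03", "B", "close", 20.0),
    ("2022-01-04", "A", "close", 11.0),
    ("2022-01-04", "A", "volume", 500.0),
]
adapter: DataAdapter = InMemoryAdapter(rows)
print(adapter.get_price("close", "2022-01-03", "2022-01-04", ["A"]))
```

Because `Protocol` uses structural typing, the adapter does not need to inherit from `DataAdapter`; matching method signatures are enough for a type checker.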

Known limitations (v0.1)

  • Factor formulas outside the DSL whitelist (pct_change / rolling_mean / rolling_std / rolling_sum / corr / rank / zscore + arithmetic) fall back to LLM codegen or produce a stub.
  • Minimal pandas backtest (quintile long-short, weekly rebalance). No transaction costs, no slippage, no position sizing constraints.
  • Red Team is computational only; semantic LLM critique is v0.2.
  • No real-time / intraday / trading-execution features (roadmap v1.5 → v2.0).
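
To illustrate the first limitation, here is one way a formula could be screened against the DSL whitelist before deciding whether a fallback is needed. This is a hypothetical sketch built on Python's `ast` module and assumes formulas parse as Python expressions; it is not the project's interpreter, and `unsupported_calls` is an invented name.

```python
import ast

# Function names taken from the whitelist in the limitation above.
WHITELIST = {
    "pct_change", "rolling_mean", "rolling_std", "rolling_sum",
    "corr", "rank", "zscore",
}


def unsupported_calls(formula):
    """Return the called names in `formula` that are not whitelisted."""
    tree = ast.parse(formula, mode="eval")
    bad = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            # Plain names like rolling_mean(...); anything else (e.g. method
            # calls) is reported by its source text and treated as unsupported.
            if isinstance(node.func, ast.Name):
                name = node.func.id
            else:
                name = ast.unparse(node.func)
            if name not in WHITELIST:
                bad.add(name)
    return bad


print(unsupported_calls("rolling_mean(close, 20)"))             # set()
print(unsupported_calls("zscore(rank(close) - ewm(close, 5))"))  # {'ewm'}
```

An empty result means every call is within the whitelist; a non-empty result corresponds to the case where the pipeline would fall back to LLM codegen or a stub.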

Roadmap

  • v0.2: LLM-based Red Team critique, transaction cost modeling, attribution (Barra factors), robustness (walk-forward / bootstrap), experiment memory (SQLite).
  • v1.5: real-time factor monitoring, IC decay alerts, intraday signals.
  • v2.0: broker API integration, paper trading → live.

Vendored components

| Component | Source | Purpose |
|---|---|---|
| paper2alpha | VernonOY/paper2alpha | PDF → ResearchCard extraction |
| qtype | VernonOY/qtype | static lint for look-ahead / future-function bugs |

Licenses preserved verbatim in LICENSE-VENDORED.md.

Quality

  • 52 tests (unit + end-to-end subprocess) · 84% coverage · CI on Python 3.11 + 3.12
  • ruff check · ruff format --check · mypy --strict all green
  • See tests/unit/ for fast per-module tests, e2e/ for full-pipeline subprocess tests

Support

Bug reports, questions, feedback — open a GitHub Issue or start a Discussion. See SUPPORT.md.

License

MIT — see LICENSE. Vendored components retain upstream licenses.

About

Research-reproduction Agent: PDF → factor code → backtest → Red Team → reproducibility score. Part of the alpha-kit stack.
