Modular Learn-to-Rank Production Toolkit
RankForge fills the gap between learning-to-rank (LTR) model libraries such as XGBoost and LightGBM and production deployment. It provides the orchestration layer around the model: pluggable feature stores, replay-based backtesting, an A/B test harness, and FastAPI serving, all behind a consistent, model-agnostic interface.
pip install rankforge # core only
pip install rankforge[serve] # + FastAPI serving
pip install rankforge[redis] # + Redis feature store
pip install rankforge[serve,redis] # everything

from rankforge import XGBoostRanker, InMemoryFeatureStore, ReplayEngine, ABTest
# Train
model = XGBoostRanker(n_estimators=100)
model.train(train_df, label_col="relevance", group_col="query_id")
# Evaluate
metrics = model.evaluate(test_df, "relevance", "query_id")
print(metrics) # {"ndcg@10": 0.82, "map": 0.74, "mrr": 0.91}
# Feature store
store = InMemoryFeatureStore()
store.hydrate_static_scores(product_df, id_col="product_id", score_cols=["popularity"])
# Backtest
engine = ReplayEngine(historical_logs_df)
report = engine.evaluate(model, segments=["device_type", "user_cohort"])
print(report.summary())
# A/B test (model_v1 and model_v2 are two previously trained rankers)
ab = ABTest(control=model_v1, treatment=model_v2)
report = ab.run(eval_df)
print(report.summary())

| Module | Description |
|---|---|
| `rankforge.core` | `RankModel` interface, `XGBoostRanker`, `LightGBMRanker`, IR metrics |
| `rankforge.features` | `InMemoryFeatureStore`, `RedisFeatureStore`, static score hydration |
| `rankforge.backtest` | `ReplayEngine`: replay logs, segment analysis, lift vs baseline |
| `rankforge.experiment` | `ABTest`: query-level splits, Welch's t-test, multi-metric |
| `rankforge.serve` | FastAPI app factory for production serving |
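
The IR metrics and the Welch's t-test comparison listed above can be illustrated without RankForge itself. The sketch below is not RankForge's implementation: `dcg_at_k`, `ndcg_at_k`, `welch_t`, and the sample relevance lists are hypothetical names and data chosen for illustration. It computes NDCG@k per query using the common linear-gain formulation, then compares the per-query scores of two arms with Welch's t statistic and the Welch-Satterthwaite degrees of freedom, using only the standard library.

```python
import math
from statistics import mean, variance


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items in ranked order (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """NDCG@k for one query: DCG of the given ranking over DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def welch_t(a, b):
    """Welch's t statistic (b minus a) and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of sample a
    vb = variance(b) / len(b)  # squared standard error of sample b
    t = (mean(b) - mean(a)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical data: each inner list holds the graded relevance labels of one
# query's results, in the order the arm ranked them. Queries are split between
# arms, so the two samples are independent.
control_rankings = [[1, 0, 2, 0], [0, 1, 0, 0], [2, 0, 1, 0], [0, 0, 1, 2]]
treatment_rankings = [[2, 1, 0, 0], [1, 0, 0, 0], [2, 1, 0, 0], [2, 0, 1, 0]]

control_ndcg = [ndcg_at_k(r) for r in control_rankings]
treatment_ndcg = [ndcg_at_k(r) for r in treatment_rankings]

t_stat, dof = welch_t(control_ndcg, treatment_ndcg)
print(f"control mean NDCG@10:   {mean(control_ndcg):.3f}")
print(f"treatment mean NDCG@10: {mean(treatment_ndcg):.3f}")
print(f"Welch t = {t_stat:.3f}, df = {dof:.2f}")
```

Welch's test is a reasonable fit for query-level splits because the two arms generally see different numbers of queries and have unequal variances. To turn the statistic into a p-value, the same per-query lists can be passed to `scipy.stats.ttest_ind(control_ndcg, treatment_ndcg, equal_var=False)` if SciPy is available.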
License: Apache 2.0