Cascading runtime for AI agents. Optimize cost, latency, quality, and policy decisions inside the agent loop.
-
Updated
Mar 12, 2026 - Python
Cascading runtime for AI agents. Optimize cost, latency, quality, and policy decisions inside the agent loop.
Confidence-triggered cascaded inference for real-time object detection under compute constraints, evaluated on cross-dataset pedestrian detection.
Add a description, image, and links to the model-cascading topic page so that developers can more easily learn about it.
To associate your repository with the model-cascading topic, visit your repo's landing page and select "manage topics."