An introduction to DSPy - a framework for programmatically solving tasks with large language models and vision language models.
Author: Igor Zubrycki
Email: igorzubrycki@gmail.com
DSPy is a framework that allows you to programmatically solve tasks with language models through:
- Signatures: Declarative specifications of input/output behavior
- Modules: Reusable components that can be composed together
- Optimizers (Teleprompters): Automatic optimization of prompts and examples
- Tool Integration: Seamless connection with external APIs and databases
pip install dspy opik
This repository contains a comprehensive tutorial covering:
- Using `dspy.Predict` for simple predictions
- Working with signatures and type constraints
- Handling different input types (text, numbers, images)
- Creating custom signatures with detailed descriptions
- Optional output fields
- Type annotations and constraints
- Switching between different language models
- Cost considerations and optimization
- Using specialized models for specific tasks
- Building reusable components
- Combining strategies to fit your needs
- Chain of thought reasoning
- Connecting external APIs and services
- Working with databases and vector stores
- Implementing retrieval strategies
- Reverse index search vs embedding-based search
- Integration with vector databases (ChromaDB)
- Using LangChain tools for file formats and databases
- Built-in FAISS support with `dspy.Embeddings`
- Dataset preparation and requirements
- Metrics design and evaluation
- Using teleprompters for automatic optimization
- Judge models for evaluation
- Integration with MLOps tools (Opik, MLFlow)
- Tracing information flow
- Cost and performance monitoring
- Jupyter Notebook: Interactive tutorial with hands-on examples
- PDF Presentation: Comprehensive presentation slides
- README.md: This overview and guide
- Try it in Colab: Click the Colab badge above to run the notebook directly in Google Colab
- Local Setup: Clone this repository and install dependencies
- Follow Along: Work through the notebook examples step by step
- Multi-modal Support: Work with text, numbers, and images
- Flexible Input/Output: Handle various data types and formats
- Optimization: Automatic prompt and example optimization
- Integration: Connect with external tools and databases
- Monitoring: Track costs, performance, and information flow
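Automatic optimization depends on a metric that scores each prediction against a reference example. DSPy metrics are conventionally plain functions called as `metric(example, pred, trace=None)`. The sketch below shows one possible exact-match metric; it uses `SimpleNamespace` stand-ins instead of real DSPy objects so that it runs without a model, and the `answer` field name is an assumption made for the example.

```python
from types import SimpleNamespace


def normalize(text) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(str(text).lower().split())


def exact_match(example, pred, trace=None) -> bool:
    """Return True when the predicted answer matches the reference answer."""
    return normalize(example.answer) == normalize(pred.answer)


def accuracy(devset, predictions, metric=exact_match) -> float:
    """Average the metric over paired examples and predictions."""
    scores = [metric(ex, p) for ex, p in zip(devset, predictions)]
    return sum(scores) / len(scores) if scores else 0.0


# Stand-in data: in a real run these would be dspy.Example objects
# and the outputs of a DSPy program.
devset = [
    SimpleNamespace(question="2 + 2?", answer="4"),
    SimpleNamespace(question="Capital of France?", answer="Paris"),
]
predictions = [
    SimpleNamespace(answer=" 4 "),
    SimpleNamespace(answer="Lyon"),
]

print(accuracy(devset, predictions))  # prints 0.5
```

A function with this shape can be passed as the `metric` argument to a DSPy optimizer or evaluator; a stricter or softer comparison (for example, a judge model) can be swapped in without changing the call signature.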
import os
import dspy
# Configure your model (assumes the API key is exported as GEMINI_API_KEY)
your_api_key = os.environ["GEMINI_API_KEY"]
model = dspy.LM("gemini/gemini-2.5-flash-lite", api_key=your_api_key)
dspy.configure(lm=model)
# Simple prediction: the inline signature maps an input field to an output field
sum_of_numbers = dspy.Predict('numbers -> sum_of_numbers')
result = sum_of_numbers(numbers="12, 13, 15")
print(result)
MIT License - see the LICENSE file for details. Please pop me a message if you want to use the slides in your presentations.