If you find this project helpful, please consider giving it a star! It helps others discover this learning resource and motivates us to keep improving it. Thank you!
This project is officially participating in Hacktoberfest 2025!
What this means for contributors:
- All valid PRs count towards your goal of 6 accepted PRs
- Earn digital badges and exclusive Hacktoberfest 2025 swag
- TreeNation contribution for every 6th accepted PR (making the world greener!)
- Perfect for learning Change Data Capture, Event-Driven Architecture, and Real-Time Data Streaming
Hacktoberfest 2025 Details:
- Registration: September 15 - October 31, 2025
- Contribution Period: October 1 - October 31, 2025
- Goal: 6 high-quality accepted PRs
- Rewards: Digital badges, exclusive T-shirts, and tree contributions
This application is a WebSocket-based FastAPI project that demonstrates real-time Change Data Capture (CDC) using Debezium and Kafka. It's designed as an educational resource to help developers understand:
- Change Data Capture fundamentals
- Real-time data streaming concepts
- Event-driven architecture patterns
- WebSocket implementations
- Database replication techniques
Debezium enables CDC to capture row-level changes in databases, allowing applications to respond to those changes instantly. This README provides installation, configuration, and usage details for the application.
- Prerequisites
- Installation
- Debezium Configuration
- Application Configuration
- Usage
- Features
- Project Structure
- Makefile Commands
- Documentation
- License
- Docker: Required to run containers for the application and associated services.
- Docker Compose: Manages the multi-container environment.
- Bruno API Client (optional): Used for creating and managing requests to Debezium connectors.
- Clone the Repository:
  git clone https://github.com/AndrGab/debezium.git
  cd debezium
- Build and Run the Containers:
  make build
  make up
This builds the FastAPI application and starts PostgreSQL, Kafka, Zookeeper, Debezium Connect, and the FastAPI service, exposing configured ports for each service.
To enable logical replication, set the PostgreSQL Write-Ahead Logging (WAL) level to logical:
ALTER SYSTEM SET wal_level = logical;
Restart the database container after changing this setting so that it takes effect.
Verify the WAL level:
SELECT * FROM pg_settings WHERE name = 'wal_level';
Run the following SQL commands in PostgreSQL to set up the super_heroes table for CDC:
CREATE TABLE public.super_heroes (
id serial4 NOT NULL,
"name" varchar(255) NOT NULL,
secret_identity varchar(255) NOT NULL,
powers varchar(255) NOT NULL,
CONSTRAINT super_heroes_pkey PRIMARY KEY (id)
);
Insert initial data:
INSERT INTO super_heroes ("name", secret_identity, powers)
VALUES ('SuperMan', 'Clark Kent', 'flight, x-ray vision, strength, heat vision');
Set the replica identity of the super_heroes table to FULL for Debezium to capture detailed row-level changes:
ALTER TABLE super_heroes REPLICA IDENTITY FULL;
Use a POST request (0.0.0.0:8083/connectors) to create a source connector for the super_heroes table. You can use Bruno or another API client to send the following JSON configuration:
{
"name": "source-connector-super-heroes",
"config": {
"connector.class": "io.debezium.connector.postgresql.PostgresConnector",
"database.hostname": "tester_db",
"database.port": "5432",
"database.user": "postgres",
"database.password": "postgres",
"database.dbname": "postgres",
"database.server.name": "postgres",
"table.include.list": "public.super_heroes",
"plugin.name": "pgoutput",
"slot.name": "slotheroes",
"topic.prefix": "cdc-using-debezium-super-heroes"
}
}
Key configurations are located in app/settings.py and pyproject.toml, including:
- Kafka Host and Topic: Ensures alignment with Debezium's Kafka topics.
- Server Metadata: Customizable in pyproject.toml for project name, version, and contact information.
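As an illustration of what aligning with Debezium's Kafka topics involves: by default the Debezium PostgreSQL connector names each change topic `<topic.prefix>.<schema>.<table>`. The sketch below derives the topic from the connector values in this README. The `KafkaSettings` class and the `kafka:9092` address are hypothetical and are not taken from `app/settings.py`.

```python
from dataclasses import dataclass


def debezium_topic(topic_prefix: str, schema: str, table: str) -> str:
    """Default Debezium PostgreSQL topic name: <prefix>.<schema>.<table>."""
    return f"{topic_prefix}.{schema}.{table}"


@dataclass(frozen=True)
class KafkaSettings:
    # Hypothetical settings holder; the real fields live in app/settings.py.
    host: str
    topic: str


settings = KafkaSettings(
    host="kafka:9092",  # assumed broker address inside the Compose network
    topic=debezium_topic("cdc-using-debezium-super-heroes", "public", "super_heroes"),
)
print(settings.topic)
```

If the consumer subscribes to a topic that differs from this derived name, it would simply receive no events, so this is a useful first thing to check when nothing appears in the UI.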
- Access the Application:
  - Navigate to http://localhost:8000 in your browser.
  - Each client receives a unique ID and connects to a WebSocket to receive real-time messages.
- WebSocket Messaging:
  - Clients connect via /ws/{client_id}.
  - Database events (create, update, delete) trigger notifications across connected clients.
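The broadcast behaviour described above can be sketched as a small connection manager. This is a minimal illustration, not the contents of `app/internal/connection_manager.py`: the class layout, the `TextSocket` protocol, and the `FakeSocket` stand-in are assumptions. It relies only on an async `send_text()` method, which FastAPI's `WebSocket` provides.

```python
import asyncio
from typing import Dict, List, Protocol


class TextSocket(Protocol):
    """Anything with an async send_text(), e.g. a FastAPI WebSocket."""

    async def send_text(self, data: str) -> None: ...


class ConnectionManager:
    def __init__(self) -> None:
        self._clients: Dict[str, TextSocket] = {}

    def connect(self, client_id: str, socket: TextSocket) -> None:
        self._clients[client_id] = socket

    def disconnect(self, client_id: str) -> None:
        self._clients.pop(client_id, None)

    async def broadcast(self, message: str) -> int:
        """Send message to every client and return how many received it."""
        delivered = 0
        # Iterate over a copy so failed clients can be removed while looping.
        for client_id, socket in list(self._clients.items()):
            try:
                await socket.send_text(message)
                delivered += 1
            except Exception:
                # Treat a failed send as a disconnected client.
                self.disconnect(client_id)
        return delivered


class FakeSocket:
    """In-memory stand-in for a WebSocket, used only for this demo."""

    def __init__(self) -> None:
        self.sent: List[str] = []

    async def send_text(self, data: str) -> None:
        self.sent.append(data)


async def demo() -> int:
    manager = ConnectionManager()
    manager.connect("client-1", FakeSocket())
    manager.connect("client-2", FakeSocket())
    return await manager.broadcast("row updated")


print(asyncio.run(demo()))
```

Keying clients by `client_id` mirrors the `/ws/{client_id}` route, and removing a client on a failed send keeps one broken connection from blocking delivery to the others.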
- Real-Time Database Monitoring: Listens for PostgreSQL changes via Kafka and broadcasts them.
- WebSocket Notifications: WebSocket connections distribute messages to all connected clients.
- User Interface: Messages display in a chat interface with styles indicating operation type.
- Educational Focus: Perfect for learning CDC, event-driven architecture, and real-time data streaming.
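To make the path from Kafka message to chat notification more concrete, here is a rough sketch of decoding a Debezium change event. The `op`, `before`, and `after` fields are part of Debezium's event envelope; the `describe_event` function and its output format are hypothetical and do not reproduce the project's `consumer.py`.

```python
import json
from typing import Optional

# Debezium operation codes found in the "op" field of a change event.
OP_LABELS = {"c": "create", "u": "update", "d": "delete", "r": "snapshot read"}


def describe_event(raw: Optional[bytes]) -> Optional[str]:
    """Turn a raw Kafka message value into a short notification string."""
    if raw is None:
        # Tombstone record that Debezium emits after a delete; nothing to show.
        return None
    event = json.loads(raw)
    # With schemas enabled in the JSON converter, the event sits under "payload".
    payload = event.get("payload", event)
    op = payload.get("op")
    label = OP_LABELS.get(op, f"unknown ({op})")
    # Deletes carry the row in "before"; other operations carry it in "after".
    row = payload.get("before") if op == "d" else payload.get("after")
    return f"{label}: {json.dumps(row, sort_keys=True)}"


sample = json.dumps(
    {"before": None, "after": {"id": 1, "name": "SuperMan"}, "op": "c"}
).encode("utf-8")
print(describe_event(sample))
```

A front end could use the label (create, update, delete) to choose the style applied to each message.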
By working with this project, you will learn to:
- Understand how CDC captures database changes in real-time
- Learn about Debezium connectors and their configuration
- Practice with PostgreSQL logical replication
- Implement event-driven patterns using Kafka
- Learn about message brokers and event streaming
- Understand pub/sub messaging patterns
- Build WebSocket connections for real-time updates
- Implement connection management and broadcasting
- Handle client disconnections and reconnections
- Separate concerns between database, message broker, and API
- Implement scalable, decoupled systems
- Learn containerization with Docker Compose
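As one concrete exercise for the CDC objectives above: with `REPLICA IDENTITY FULL`, an update event carries the complete previous row in `before`, so the changed columns can be computed by comparing it with `after`. The sketch below assumes an already-decoded event payload; `changed_fields` is a hypothetical helper, not part of this repository.

```python
from typing import Any, Dict, Tuple


def changed_fields(payload: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Map each changed column to (old, new) for an update event."""
    if payload.get("op") != "u":
        # Only updates have both a meaningful "before" and "after".
        return {}
    before = payload.get("before") or {}
    after = payload.get("after") or {}
    return {
        column: (before.get(column), new_value)
        for column, new_value in after.items()
        if before.get(column) != new_value
    }


update_event = {
    "op": "u",
    "before": {"id": 1, "name": "SuperMan", "powers": "flight"},
    "after": {"id": 1, "name": "SuperMan", "powers": "flight, strength"},
}
print(changed_fields(update_event))
```

Without the FULL replica identity, `before` is generally absent or limited to key columns, in which case a comparison like this would report every column as changed.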
- Register for Hacktoberfest 2025 (Sept 15 - Oct 31)
- Fork this repository
- Check our HACKTOBERFEST.md for available issues
- Pick an issue labeled hacktoberfest or good first issue
- Make your contribution between October 1-31, 2025
- hacktoberfest - Official Hacktoberfest issues
- good first issue - Perfect for beginners
- help wanted - Needs community attention
- bug - Issues that need fixing
- enhancement - New features to implement
What counts:
- Bug fixes and improvements
- New features and enhancements
- Documentation improvements
- UI/UX enhancements
- Tests and test coverage
- Code optimization and refactoring
What does not count:
- Spam or low-quality contributions
- Duplicate PRs
- Whitespace-only changes
- Generated files
- PRs without associated issues
- Digital Badges: Unlock badges for each accepted PR
- Exclusive Swag: T-shirts for "Super Contributors" (first 10,000)
- Tree Contributions: Every 6th PR helps plant trees via TreeNation
- Learning: Hands-on experience with modern technologies
debezium-tester-app/
├── app/
│   ├── internal/
│   │   ├── consumer.py              # Kafka consumer handling
│   │   └── connection_manager.py    # WebSocket connection manager
│   ├── routes/
│   │   └── websockets.py            # WebSocket route definitions
│   ├── templates/
│   │   └── index.html               # Frontend template
│   ├── static/
│   │   ├── script.js                # Client-side WebSocket logic
│   │   └── styles.css               # UI styling
│   ├── main.py                      # Main FastAPI app entry
│   └── settings.py                  # Application configurations
├── Dockerfile
├── docker-compose.yaml              # Docker services configuration
├── pyproject.toml                   # Project metadata
└── docs/                            # Documentation directory
    ├── bruno_requests/              # Bruno requests for API interactions
    └── queries/                     # Example SQL queries for PostgreSQL setup
- Build the Docker Image: Builds the application Docker image.
  make build
- Start Containers: Launches the application and associated services in detached mode.
  make up
- Run Linter: Checks for code style issues and fixes them with ruff.
  make linter
- Format Code: Applies code formatting with ruff.
  make format
The /docs directory contains:
- bruno_requests/: A collection of example Bruno requests for setting up and testing Debezium connectors.
- queries/: Example SQL queries for configuring PostgreSQL, including table creation and WAL configuration.
These documents provide helpful resources for configuring the application environment and understanding its capabilities.
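For readers who prefer not to install Bruno, the connector request from the Debezium Configuration section can also be sent from Python using only the standard library. This is a sketch under assumptions: `http://localhost:8083` is taken to be the host-side address of Debezium Connect, and the function names are hypothetical. `POST /connectors` is the standard Kafka Connect REST endpoint for creating a connector.

```python
import json
import urllib.request

CONNECT_URL = "http://localhost:8083"  # assumed address of Debezium Connect

# Connector definition copied from the Debezium Configuration section.
SUPER_HEROES_CONNECTOR = {
    "name": "source-connector-super-heroes",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "tester_db",
        "database.port": "5432",
        "database.user": "postgres",
        "database.password": "postgres",
        "database.dbname": "postgres",
        "database.server.name": "postgres",
        "table.include.list": "public.super_heroes",
        "plugin.name": "pgoutput",
        "slot.name": "slotheroes",
        "topic.prefix": "cdc-using-debezium-super-heroes",
    },
}


def build_connector_request(
    connector: dict, base_url: str = CONNECT_URL
) -> urllib.request.Request:
    """Build (but do not send) the POST /connectors request."""
    return urllib.request.Request(
        url=f"{base_url}/connectors",
        data=json.dumps(connector).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def register_connector(connector: dict, base_url: str = CONNECT_URL) -> int:
    """Send the request and return the HTTP status code."""
    request = build_connector_request(connector, base_url)
    with urllib.request.urlopen(request) as response:
        return response.status


# Usage, once the containers are running:
#   register_connector(SUPER_HEROES_CONNECTOR)
```

Kafka Connect typically answers a successful creation with status 201. If a connector with the same name already exists, `urlopen` raises an `HTTPError` instead of returning.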
This project is licensed under the MIT License.