https://github.com/STIWK3014-A242/class-activity-stiwk3014/blob/main/NewGroupMembers.md
- Matric Number & Name & Photo & Phone Number
- Mention who the leader is.
- Mention your group name for Assignment-1 and Assignment-2
- Other related info (if any)
# TubeTrak: Real-Time YouTube Analytics via Kafka and Telegram Notifications (@YT_Analysis_bot)

## Background
In today's digital landscape, YouTube has emerged as one of the most influential platforms for content creation and consumption. With over 2 billion logged-in monthly users and 500 hours of video uploaded every minute, content creators and marketers face significant challenges in monitoring and analyzing engagement metrics in real-time. Traditional analytics tools often provide delayed insights, limiting creators' ability to respond promptly to audience interactions.
## Problem Statement (from article)
According to Isah and Zulkernine (2018), "the velocity and volume of data generated by social media platforms present significant challenges for real-time analytics systems." Their research highlights that "traditional batch processing methods are inadequate for scenarios requiring immediate insights and responses" (Isah & Zulkernine, 2018). This is particularly evident in YouTube analytics, where content creators face substantial delays in receiving engagement data through native platforms, preventing timely intervention and audience interaction. Venturelli (2024) further notes that "the lack of real-time notification systems for specific engagement patterns represents a critical gap in current social media analytics tools."
## Main Objective
TubeTrak aims to develop a real-time YouTube analytics system that processes comment data from multiple videos simultaneously, provides instant visualization of engagement metrics, and delivers targeted notifications for specific comment patterns. The system seeks to empower content creators with timely insights to enhance their engagement strategies and content optimization efforts.
## Methodology
This project adopts an event-driven microservices architecture as recommended by Narkhede et al. (2017) for real-time data processing systems. Following the approach outlined by Bhogle (2023), we implemented three Spring Boot applications interconnected through Apache Kafka for asynchronous message processing. The producer service integrates with the YouTube Data API (Google Developers, n.d.) to fetch video metadata and comments, which are then published to dedicated Kafka topics. The consumer service, built using the Spring Kafka framework as described by Chirumamilla (2023), processes incoming data streams and provides analytics through a web dashboard with WebSocket connections for real-time updates. For notifications, we developed a Telegram bot service using the Telegram Bot API (Telegram, n.d.) that monitors specific data patterns. The entire system is containerized using Docker and Docker Compose following best practices outlined by Thiyagarajan and Nayak (2025) to ensure consistent deployment across environments.
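The three-service layout described above maps naturally onto a Docker Compose file. The sketch below is a minimal illustration, not the project's actual docker-compose.yml: the Kafka/ZooKeeper images, service names, ports, and environment settings are assumptions (only the three `pingseng/tubetrack-*` images are taken from this document).

```yaml
# Hypothetical sketch of the stack; broker images and settings are illustrative only.
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.5.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181

  kafka:
    image: confluentinc/cp-kafka:7.5.0
    depends_on: [zookeeper]
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

  producer:
    image: pingseng/tubetrack-producer:1.0.0
    ports: ["8080:8080"]
    depends_on: [kafka]

  consumer:
    image: pingseng/tubetrack-consumer:1.0.0
    ports: ["8081:8081"]
    depends_on: [kafka]

  telegram-bot:
    image: pingseng/tubetrack-telegram-bot:1.0.0
    depends_on: [kafka]
```

A single-broker, single-ZooKeeper setup like this is only suitable for local development and demos, which matches how the rest of this README uses it.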
## Results
TubeTrak successfully demonstrates real-time processing of YouTube comment data with minimal latency (under 5 seconds). The system effectively handles multiple video streams simultaneously, providing instant analytics visualization through an interactive dashboard. The Telegram notification system accurately identifies and alerts users about comments with odd-length patterns, while the analytics dashboard displays metrics for even-length comments. Performance testing shows the system maintains stability with up to 5 concurrent video streams.
## Conclusion
TubeTrak addresses the critical need for real-time YouTube analytics by leveraging modern data streaming technologies. The implementation of Apache Kafka with Spring Boot microservices provides a robust foundation for real-time data processing and visualization. The project demonstrates how event-driven architecture can be applied to social media analytics, offering content creators valuable insights without the delays associated with traditional analytics platforms. Future enhancements could include sentiment analysis, expanded notification criteria, and integration with additional social media platforms.
## Docker Images

- Producer: `pingseng/tubetrack-producer:1.0.0`
- Consumer: `pingseng/tubetrack-consumer:1.0.0`
- Telegram Bot: `pingseng/tubetrack-telegram-bot:1.0.0`
## Running Locally

- Clone the repository to your local machine.
- Make sure you have Docker and Docker Compose installed on your system.
- Navigate to the project root directory.
- Run the following command to start all services:

  ```
  docker-compose up -d
  ```

- To stop all services:

  ```
  docker-compose down
  ```
### Producer API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Main UI page for adding YouTube videos |
| `/api/videos` | GET | Get all tracked videos |
| `/api/videos` | POST | Add a new YouTube video for tracking |
| `/api/videos` | DELETE | Delete all videos |
| `/api/videos/{videoId}` | GET | Get a specific video by ID |
| `/api/videos/{videoId}` | DELETE | Delete a specific video |
| `/api/videos/{videoId}/status` | GET | Check processing status of a video |
| `/api/stats` | GET | Get statistics about tracked videos |
| `/api/videos/health` | GET | Health check endpoint |
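Assuming the producer runs on `localhost:8080`, a session against these endpoints might look like the sketch below. The JSON body shape (a `url` field) is an assumption, since the actual field names depend on the producer's controller; the live `curl` calls are therefore left commented out, and the block only prints the request body it would send.

```shell
# Hypothetical request body; the real field names are not documented here.
BODY='{"url":"https://www.youtube.com/watch?v=VIDEO_ID"}'

# Add a video for tracking (requires the producer to be running):
# curl -s -X POST http://localhost:8080/api/videos \
#      -H 'Content-Type: application/json' -d "$BODY"

# List all tracked videos:
# curl -s http://localhost:8080/api/videos

# Health check:
# curl -s http://localhost:8080/api/videos/health

echo "$BODY"
```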
### Consumer API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Analytics dashboard UI |
| `/api/analytics` | GET | Get analytics data |
| `/api/analytics` | DELETE | Delete all analytics data |
| `/api/videos/{videoId}` | GET | Get processed video data by ID |
| `/api/videos/{videoId}` | DELETE | Delete processed video data by ID |
### WebSocket Endpoints

| Endpoint | Description |
|---|---|
| `/ws` | WebSocket connection for real-time updates |
| `/topic/dashboard` | Topic for dashboard updates |
| `/topic/video/{videoId}` | Topic for specific video updates |
### YouTube API Key Setup

- Go to the Google Cloud Console
- Create a new project or select an existing one
- Enable the YouTube Data API v3
- Create an API key
- Add the API key to the environment variables or directly in the docker-compose.yml file
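Once the key exists, the producer's comment fetches go through the YouTube Data API v3 `commentThreads` endpoint. The endpoint and parameters below come from the public API reference; the key and video ID are placeholders, so this sketch only constructs and prints the request URL, with the live call left commented out.

```shell
API_KEY="YOUR_API_KEY"   # placeholder - substitute your real key
VIDEO_ID="dQw4w9WgXcQ"   # example public video ID
URL="https://www.googleapis.com/youtube/v3/commentThreads?part=snippet&videoId=${VIDEO_ID}&maxResults=20&key=${API_KEY}"

# Live call (needs a valid key and network access):
# curl -s "$URL"
# Comment text appears at items[].snippet.topLevelComment.snippet.textDisplay

echo "$URL"
```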
### Telegram Bot Setup

- Talk to @BotFather on Telegram
- Create a new bot using the `/newbot` command
- Get the bot token
- Add the token to the environment variables or directly in the docker-compose.yml file
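One way to wire the API key and bot token in is through `environment` blocks in docker-compose.yml. The variable names below are assumptions for illustration; they must match whatever the Spring configuration inside each image actually reads.

```yaml
# Illustrative only; variable names must match the images' configuration.
services:
  producer:
    image: pingseng/tubetrack-producer:1.0.0
    environment:
      YOUTUBE_API_KEY: ${YOUTUBE_API_KEY}       # assumed variable name
  telegram-bot:
    image: pingseng/tubetrack-telegram-bot:1.0.0
    environment:
      TELEGRAM_BOT_TOKEN: ${TELEGRAM_BOT_TOKEN} # assumed variable name
```

With this shape, the secrets stay out of the file itself and can be supplied via a local `.env` file or the shell environment.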
### Running the System

- Start the system using Docker Compose: `docker-compose up -d`
- Find the Telegram bot (@YT_Analysis_bot) and send the `/start` command to initialize it
- To stop receiving notifications later, send the `/stop` command to the bot
### Testing the System

- Open the Producer UI at `http://localhost:8080`
- Add YouTube video URLs (up to 5 videos) using the form
- Wait for the system to process the videos (approximately 5-10 seconds)
- Open the Consumer Analytics Dashboard at `http://localhost:8081` to view the analytics
- Check your Telegram bot for notifications about comments with odd lengths
- Test the deletion functionality by removing videos from the Producer UI
- Verify that the Analytics Dashboard updates in real-time
### Notes

- Use popular YouTube videos with many comments for best results
- The system processes new comments every 5 seconds
- The WebSocket connection provides real-time updates to the UI
- The Telegram bot only sends notifications for comments with odd lengths
- The Analytics Dashboard only displays data for comments with even lengths
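The odd/even split described in these notes reduces to a character-length parity check. A minimal shell sketch of the rule (illustrative only; the real routing happens inside the Java consumer and bot services):

```shell
# Route a comment by the parity of its character count.
route_comment() {
  local comment="$1"
  local len=${#comment}
  if (( len % 2 == 1 )); then
    echo "telegram"     # odd length  -> Telegram notification
  else
    echo "dashboard"    # even length -> analytics dashboard
  fi
}

route_comment "Nice!"          # 5 chars, odd  -> telegram
route_comment "Great video!"   # 12 chars, even -> dashboard
```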
## References

- Amigoscode. (2022, February 3). Kafka tutorial - Spring Boot Microservices [Video]. YouTube. https://www.youtube.com/watch?v=SqVfCyfCJqw
- Apache Kafka. (n.d.). Kafka documentation. Apache. https://kafka.apache.org/documentation/
- Apache Kafka. (n.d.). Kafka Streams documentation. Apache. https://kafka.apache.org/documentation/streams/
- Bhogle, K. (2023, November 19). Real-time data processing with Apache Kafka and SpringBoot (…and MySQL): A journey continues. Medium. https://medium.com/@karanbhogle/real-time-data-processing-with-apache-kafka-and-springboot-and-mysql-a-journey-continues-d82bb2cc6dac
- Chirumamilla, P. (2023, October 9). Real-time data processing with Spring Boot and Kafka Streams. Medium. https://medium.com/@pradeepchirumamilla01/real-time-data-processing-with-spring-boot-and-kafka-streams-d10963b04438
- Clistas. (2023, October 17). Real-time analytics with Spring Boot: Mastering streaming queries and an advanced suite of services for instant insights. Medium. https://medium.com/@clistastech/real-time-analytics-with-spring-boot-mastering-streaming-queries-and-an-advanced-suite-of-3cfb497178d8
- Confluent. (n.d.). Introduction to Apache Kafka. https://www.confluent.io/what-is-apache-kafka/
- Developersmonk. (2025, February 16). Build a Telegram YouTube Video Search Bot with Spring Boot & YouTube API! [Video]. YouTube. https://www.youtube.com/watch?v=oBDuTX_SSds
- Docker. (n.d.). Docker documentation. https://docs.docker.com/
- GeeksforGeeks. (2024, September 6). Spring Boot Integration with Kafka. GeeksforGeeks. https://www.geeksforgeeks.org/advance-java/spring-boot-integration-with-kafka/
- Globant. (2024, May 23). Docker Compose Basics: Connect Spring Boot to MySQL and Kafka | Docker tutorial [Video]. YouTube. https://www.youtube.com/watch?v=bzmnEGfOabU
- Google Developers. (n.d.). YouTube Data API overview. https://developers.google.com/youtube/v3
- Isah, H., & Zulkernine, F. (2018). A scalable and robust framework for data stream ingestion. arXiv. https://arxiv.org/abs/1812.04197
- Leonard, A. (2019). Data stream development with Apache Spark, Kafka, and Spring Boot [Video]. Packt Publishing. https://www.oreilly.com/library/view/data-stream-development/9781789539585/
- Moumie. (2024, September 22). kafka01: Build a Java Spring Boot App with Apache Kafka in Docker – Step by Step Tutorial! [Video]. YouTube. https://www.youtube.com/watch?v=NlqoZCRcnoE
- Narkhede, N., Shapira, G., & Palino, T. (2017). Kafka: The definitive guide. O'Reilly Media. https://www.confluent.io/resources/ebook/kafka-the-definitive-guide/
- Rangdal, M. (2024, August 31). Using Apache Kafka to build pipelines for streaming data. Medium. https://medium.com/@rangdalmayura/using-apache-kafka-to-build-pipelines-for-streaming-data-d5f9160a9b28
- Raptis, T. P., & Passarella, A. (2022). On efficiently partitioning a topic in Apache Kafka. arXiv. https://arxiv.org/abs/2205.09415
- Saket, S., Chandela, V., & Kalim, M. D. (2024). Real-time event joining in practice with Kafka and Flink. arXiv. https://arxiv.org/abs/2410.15533
- Spring Boot. (n.d.). Spring Boot reference documentation. https://docs.spring.io/spring-boot/docs/current/reference/html/
- Sprenger, S. (2025). Streaming data pipelines with Kafka. Manning Publications. https://www.manning.com/books/streaming-data-pipelines-with-kafka
- Telegram. (n.d.). Telegram Bot API. https://core.telegram.org/bots/api
- Thiyagarajan, G., & Nayak, P. (2025). Docker under siege: Securing containers in the modern era. arXiv. https://arxiv.org/abs/2506.02043
- Ughele, E. (2024, March 28). Real-time data pipelines with Apache Kafka. Medium. https://medium.com/@ehibhahiemenughele/real-time-data-pipelines-with-apache-kafka-8664f757c159
- Venturelli, I. (2024, August 5). Spring Boot and Kafka: Real-time data processing. Medium. https://medium.com/codex/spring-boot-and-kafka-real-time-data-processing-ccbdc6a28e11
- Waehner, K. (2025, February 5). Data streaming use cases and industry success stories featuring Apache Kafka and Flink. https://www.kai-waehner.de/blog/2025/02/05/free-ebook-data-streaming-use-cases-and-industry-success-stories-featuring-apache-kafka-and-flink/
## Deployment on AWS EC2

### Prerequisites

- AWS account
- `.pem` key pair file
- Telegram Bot token
- Project folder with:
  - docker-compose.yml
  - Spring Boot JAR files
  - MySQL init script (if needed)
### Launch an EC2 Instance

- Go to AWS EC2 Console → Launch Instance
- Choose:
  - Amazon Linux 2023 AMI
  - `t3.medium` instance type
  - Upload or select your PEM key (`.pem` key pair file)
- Under Network settings:
  - Open inbound ports: `22`, `8080`, and `8081` (use 0.0.0.0/0 for testing/demo access)
### Connect and Install Docker

Connect to the instance:

```
ssh -i "C:/path/to/your/key.pem" ec2-user@<your-ec2-public-ip>
```

Install and start Docker:

```
sudo yum update -y
sudo yum install -y docker
sudo systemctl start docker
sudo systemctl enable docker
```

Install Docker Compose:

```
sudo curl -L https://github.com/docker/compose/releases/download/v2.24.0/docker-compose-linux-x86_64 -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
docker-compose version
```

### Deploy the Project

From your local machine, copy the project to the instance:

```
scp -i "C:/path/to/your/key.pem" -r "C:/path/to/your/project" ec2-user@<your-ec2-public-ip>:/home/ec2-user/
```

On the instance, start the services (note: the standalone binary installed above is invoked as `docker-compose`, not the `docker compose` plugin syntax):

```
cd /home/ec2-user/project
docker-compose up -d
```

Check container status:

```
docker ps
```

- Producer Web UI: `http://<your-ip>:8080` (e.g. http://18.141.225.125:8080/)
- Consumer Dashboard: `http://<your-ip>:8081` (e.g. http://18.141.225.125:8081/)











