This repository contains the complete source code, documentation, and presentation of my Final Degree Project in Computer Engineering at the University of La Laguna (ULL).
The work was also presented at the II Student Congress of Computer Engineering (2016).
The project focuses on applying data mining techniques to predict student dropouts in Massive Open Online Courses (MOOCs), using real datasets from the KDD Cup 2015 challenge.
- Classify students enrolled in MOOCs according to whether they complete or abandon the course.
- Use open-source tools for the entire pipeline, from data storage to model execution.
- Develop a Java desktop application integrating RapidMiner Studio 7.0 operators.
- Apply professional software engineering techniques for maintainable and scalable development.
- Java (Swing for GUI).
- MariaDB.
- Apache Maven.
- RapidMiner Studio 7.0.
- JUnit (testing).
- Git / GitHub.
- Doxygen (documentation).
- Data preprocessing and feature engineering.
- Multiple classification algorithms: Decision Trees, k-NN, Naive Bayes.
- Observer, Strategy, and MVC design patterns.
- Integration with RapidMiner’s internal operators.
- Case study with a real MOOC dataset (120,000+ records).
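
To illustrate how the Strategy pattern and the classification algorithms listed above can fit together, here is a minimal, self-contained sketch in plain Java. It is not taken from the project's code and does not use RapidMiner operators: the names `DropoutSketch`, `DropoutClassifier`, `Example`, and `KnnClassifier`, as well as the two features (active days and number of logged events), are hypothetical and chosen only for this example. The k-NN strategy uses Euclidean distance and a simple majority vote.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: none of these names come from the project itself.
final class DropoutSketch {

    /** Strategy interface: every algorithm answers "will this student drop out?". */
    interface DropoutClassifier {
        boolean predictsDropout(double[] features);
    }

    /** One labelled enrollment. Assumed feature layout: {activeDays, eventCount}. */
    static final class Example {
        final double[] features;
        final boolean droppedOut;

        Example(double[] features, boolean droppedOut) {
            this.features = features;
            this.droppedOut = droppedOut;
        }
    }

    /** k-NN strategy: majority vote among the k closest training examples. */
    static final class KnnClassifier implements DropoutClassifier {
        private final List<Example> training;
        private final int k;

        KnnClassifier(List<Example> training, int k) {
            if (k < 1 || k > training.size()) {
                throw new IllegalArgumentException("k must be between 1 and the training set size");
            }
            this.training = new ArrayList<>(training);
            this.k = k;
        }

        @Override
        public boolean predictsDropout(double[] features) {
            // Sort a copy of the training set by distance to the query point.
            List<Example> sorted = new ArrayList<>(training);
            sorted.sort(Comparator.comparingDouble((Example e) -> distance(e.features, features)));

            int dropoutVotes = 0;
            for (Example neighbour : sorted.subList(0, k)) {
                if (neighbour.droppedOut) {
                    dropoutVotes++;
                }
            }
            // Strict majority of the k neighbours must be dropouts.
            return dropoutVotes * 2 > k;
        }

        private static double distance(double[] a, double[] b) {
            double sum = 0.0;
            for (int i = 0; i < a.length; i++) {
                double diff = a[i] - b[i];
                sum += diff * diff;
            }
            return Math.sqrt(sum);
        }
    }

    /** Small made-up training set, for illustration only (not real KDD Cup data). */
    static List<Example> toyTrainingSet() {
        return Arrays.asList(
            new Example(new double[] {1, 5}, true),
            new Example(new double[] {2, 3}, true),
            new Example(new double[] {0, 1}, true),
            new Example(new double[] {20, 150}, false),
            new Example(new double[] {15, 90}, false),
            new Example(new double[] {25, 200}, false));
    }

    public static void main(String[] args) {
        // The caller depends only on the strategy interface, so another
        // algorithm (decision tree, Naive Bayes) could be swapped in here.
        DropoutClassifier classifier = new KnnClassifier(toyTrainingSet(), 3);
        System.out.println(classifier.predictsDropout(new double[] {3, 10}));
        System.out.println(classifier.predictsDropout(new double[] {18, 120}));
    }
}
```

Because callers hold only a `DropoutClassifier` reference, a decision tree or Naive Bayes implementation could replace `KnnClassifier` without touching the calling code, which is the main benefit of the Strategy pattern in this setting. In practice the features would need scaling before k-NN, since the raw event count would otherwise dominate the distance.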
Developed by Manuel Bacallado.
Data Mining · MOOC · Dropout Prediction · Java · MariaDB · RapidMiner · Final Degree Project · Educational Data Mining.