Welcome to our project! We are Neil de la Fuente, Nil Biescas, Xavier Soto, Jordi Longaron, and Daniel Vidal, and we have joined forces to revolutionize the way iDisc, a translation company, assigns tasks to its translators.
- Project Overview
- Repository Structure
- Data
- Installation and Usage
- Performance
- How to Contribute
- Want to know more?
- Contact
Our mission is to assist project managers at iDisc in making task assignments more efficient and effective. To achieve this, we have developed several machine learning models, including a Random Forest with Decision Trees and a Multilayer Perceptron (MLP). These models take into account various factors such as previous tasks completed by translators, client preferences, and features of the task at hand. The output is a list of top-k candidates best suited for a given task, making the assignment process streamlined and informed.
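To illustrate the idea of returning the top-k candidates, here is a minimal sketch (not the project's actual code): given per-translator suitability scores produced by a trained model, keep the k best. The names `scores` and `top_k_candidates` are hypothetical.

```python
def top_k_candidates(scores: dict, k: int = 3) -> list:
    """Rank translators by predicted suitability and keep the k best."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]

# Example scores a model might assign to each translator for one task.
scores = {"Ana": 0.91, "Marc": 0.78, "Laia": 0.85, "Pau": 0.60}
print(top_k_candidates(scores, k=2))  # ['Ana', 'Laia']
```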
- Decision_Trees: This directory contains Jupyter notebooks for the models built with decision trees: "DecisionTrees_synthesis.ipynb" and "randomforest_synthesis.ipynb".
- Models: This directory contains the model used to train the MLP.
- CheckPoints: This directory contains checkpoint files of the different models we experimented with, each with its own configuration (e.g., batch size and use of dropout).
- Utils: Inside this directory you will find three files:
  1. Utils.py — used to obtain the dataloader
  2. organaizer.py — used to organize the training and validation of the model
  3. utils_Dataset.py — used to preprocess all the data from the .pkl file
- TKinter: This directory contains a Python file that uses tkinter to create the project's interface. See the folder for an in-depth explanation.
Here is a link to the data needed for each of the models (the data may differ between the decision trees and the neural network):
Before being fed into our models, the data undergoes thorough preprocessing, including cleaning, normalization, and feature extraction, ensuring that the models receive quality data that helps them make the best predictions.
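As a rough sketch of the kind of preprocessing described above (the field names and steps are illustrative, not the project's exact pipeline):

```python
def preprocess(records):
    """Clean, normalize, and extract features from raw task records."""
    # Cleaning: drop records missing the field we need.
    clean = [r for r in records if r.get("hours") is not None]

    # Normalization: min-max scale the 'hours' field into [0, 1].
    hours = [r["hours"] for r in clean]
    lo, hi = min(hours), max(hours)
    span = (hi - lo) or 1.0

    # Feature extraction: build a numeric feature dict per record.
    return [
        {"hours_norm": (r["hours"] - lo) / span,
         "is_urgent": 1 if r.get("urgent") else 0}
        for r in clean
    ]

raw = [{"hours": 2, "urgent": True}, {"hours": 10}, {"hours": None}]
print(preprocess(raw))
```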
Before getting started, ensure Python 3.x is installed on your system. If it is not, you can download it here. Next, clone the project from GitHub to your local machine:
```shell
git clone https://github.com/NilBiescas/Synthesis_Project.git
```
To run the program, you will need to update the path to the data downloaded for the MLP. The variable to change is found in the main.py file and is named pkl_file.
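For reference, the assignment in main.py would look something like this (the path below is purely illustrative; use wherever you saved the file):

```python
# In main.py: point pkl_file at your local copy of the downloaded data.
pkl_file = "/path/to/your/downloaded_data.pkl"
```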
```shell
python main.py
```
Just download the notebooks, upload the data, and run all the cells. Yes, it's that easy!
Our models have shown promising results in optimizing the task assignment process. The Decision Trees, Random Forest, and MLP models achieved the following performance:
| Model | Accuracy | Recall | F1-Score |
|---|---|---|---|
| Decision Trees | 71% | 68% | 69% |
| Random Forest | 82% | 79% | 80% |
| MLP | 84% | 81% | 81% |
The Multilayer Perceptron (MLP) achieves the best performance, primarily due to its higher complexity and greater capacity to model intricate non-linear relationships, which gives it an edge on complex task assignment data. Its learning method, backpropagation, allows it to learn from its errors, incrementally improving its performance as it processes more data. Additionally, MLPs tend to perform better on high-dimensional data, particularly when there are sophisticated interactions between features. These qualities make it well suited to the complexity of our dataset and explain its superior performance compared to the Decision Trees.

The Decision Trees, on the other hand, are proof that a simple model can also work quite well, in this case thanks to their unbiased approach. Finally, with results close to the MLP's, we have the Random Forest, an ensemble learning method based on voting: it combines the results of several trees to provide a more consistent and confident response.

The performance measures are based on the accuracy of the task assignment. We continue to improve and optimize these models; a deeper analysis will be provided in the report.
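The voting idea behind the Random Forest can be sketched as follows; the tree predictions here are hard-coded stand-ins for the outputs of real trained trees, and the translator labels are hypothetical:

```python
from collections import Counter

def majority_vote(predictions):
    """Combine individual tree predictions by majority vote."""
    return Counter(predictions).most_common(1)[0][0]

# Three trees each vote for the translator they think fits the task best.
tree_predictions = ["translator_A", "translator_B", "translator_A"]
print(majority_vote(tree_predictions))  # translator_A
```

Because each tree sees a slightly different view of the data, aggregating their votes smooths out individual errors, which is why the Random Forest outperforms a single Decision Tree in the table above.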
We welcome contributions! If you're interested in improving our models, fixing bugs, or adding new features, please feel free to make a pull request.
Soon the report on the project will be available for you to have a deeper understanding of our work. Stay tuned for updates!
For any inquiries or issues, feel free to reach out to us:
