This repository contains a machine learning pipeline for predicting interest levels in rental listings as part of the Two Sigma Connect: Rental Listing Inquiries challenge.
Install the required dependencies:
```bash
pip install pandas numpy scikit-learn xgboost matplotlib tqdm
```

For deep learning models, additional dependencies are required:
```bash
pip install torch tensorflow transformers
```

Generate the feature dataset (this is a critical step and must be run first):

```bash
python run_features.py
```
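The feature logic itself lives in `run_features.py` and isn't reproduced in this README. As a rough illustration of the kind of engineering involved (column names follow the Kaggle listing JSON; the output path is a placeholder):

```python
import pandas as pd

# Illustrative sketch only -- see run_features.py for the actual pipeline.
df = pd.read_json("data/train.json")

features = pd.DataFrame({
    "price": df["price"],
    "bedrooms": df["bedrooms"],
    "bathrooms": df["bathrooms"],
    "num_photos": df["photos"].apply(len),           # photos is a list of URLs
    "num_features": df["features"].apply(len),       # features is a list of tags
    "description_len": df["description"].str.len(),
})
features.to_csv("data/features.csv", index=False)    # placeholder output path
```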
Train and evaluate traditional machine learning models:

```bash
python run.py --model xgb         # Train XGBoost model
python run.py --model rf          # Train Random Forest model
python run.py --model dt          # Train Decision Tree model
python run.py --model ridge       # Train Ridge Regression
python run.py --model lasso       # Train Lasso Regression
python run.py --model elasticnet  # Train ElasticNet model
python run.py --model svm         # Train SVM model
python run.py --model knn         # Train KNN model
python run.py --model logistic    # Train Logistic Regression model
```
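`run.py` is the single entry point for these models. Its internals aren't shown here, but a minimal sketch of how a `--model` flag can dispatch to an estimator registry looks like this (the `MODELS` mapping and its contents are assumptions, not the repository's actual code):

```python
import argparse

from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Hypothetical registry: CLI abbreviation -> estimator factory.
MODELS = {
    "rf": RandomForestClassifier,
    "logistic": LogisticRegression,
}

def main() -> None:
    parser = argparse.ArgumentParser(description="Train a model on the feature set.")
    parser.add_argument("--model", choices=sorted(MODELS), required=True)
    parser.add_argument("--predict", action="store_true",
                        help="also generate predictions after training")
    args = parser.parse_args()

    model = MODELS[args.model]()  # instantiate with default hyperparameters
    print(f"Training {args.model} (predict={args.predict})")
    # ... fit on the features produced by run_features.py ...

if __name__ == "__main__":
    main()
```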
For deep learning models, run the specific Python files directly:

```bash
python src/models/neural_network.py      # Basic Neural Network
python src/models/advanced_nn.py         # Advanced Neural Network
python src/models/transformer_model.py   # Transformer Model
python src/models/cnn_model.py           # Convolutional Neural Network
python src/models/rnn_model.py           # Recurrent Neural Network
```
Train a model and generate prediction results:

```bash
python run.py --model xgb --predict  # Train XGBoost and generate predictions
```
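The exact layout of the prediction file is determined by the pipeline, but for orientation: the competition expects one row per listing with a probability for each interest level. A sketch of writing such a file (column names follow the Kaggle sample submission; the values and path are dummies):

```python
import pandas as pd

# Dummy rows -- real predictions come from the trained model's predict_proba.
submission = pd.DataFrame({
    "listing_id": [7142618, 7210040],
    "high":   [0.10, 0.60],
    "medium": [0.30, 0.30],
    "low":    [0.60, 0.10],
})
submission.to_csv("outputs/submission.csv", index=False)  # placeholder path
```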
Compare the performance of multiple models:

```bash
python run.py --compare --models xgb rf logistic knn
```

Train an ensemble model:
```bash
# Average ensemble method
python run.py --ensemble --models xgb rf logistic

# Weighted ensemble method
python run.py --ensemble --ensemble-method weighted --models xgb rf

# Train and predict with ensemble model
python run.py --ensemble --predict
```

The supported models are:

| Abbreviation | Model Name | Characteristics |
|---|---|---|
| xgb | XGBoost | Gradient boosting trees, handles complex relationships |
| rf | Random Forest | Ensemble of decision trees, good stability |
| dt | Decision Tree | Simple, transparent, easy to interpret |
| logistic | Logistic Regression | Linear model with probability output |
| ridge | Ridge Regression | L2 regularized linear model |
| lasso | Lasso Regression | L1 regularized, feature selection |
| elasticnet | ElasticNet | L1+L2 regularized regression |
| svm | Support Vector Machine | Powerful non-linear classification |
| knn | K-Nearest Neighbors | Similarity-based simple classification |
| nn | Neural Network | Basic feedforward neural network |
| adv_nn | Advanced Neural Network | Deeper architecture with advanced training |
| transformer | Transformer Model | Attention-based architecture |
| cnn | Convolutional Neural Network | Image-inspired architecture |
| rnn | Recurrent Neural Network | Sequence modeling architecture |
Two ensemble learning methods are supported:
- Average: Equal weight averaging of predictions from multiple models
- Weighted: Dynamic weight allocation based on cross-validation performance (see the sketch after this list)
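A minimal sketch of how weighted averaging can work, assuming weights are taken as the inverse of each model's cross-validated log loss (the pipeline's actual weighting rule may differ):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_classes=3, n_informative=5,
                           random_state=0)
models = [RandomForestClassifier(random_state=0),
          LogisticRegression(max_iter=1000)]

# Score each model by CV log loss, then weight by inverse loss.
losses = [-cross_val_score(m, X, y, cv=3, scoring="neg_log_loss").mean()
          for m in models]
weights = np.array([1.0 / loss for loss in losses])
weights /= weights.sum()

# Weighted average of the predicted class probabilities.
probas = [m.fit(X, y).predict_proba(X) for m in models]
ensemble = sum(w * p for w, p in zip(weights, probas))
print(ensemble[:3])
```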
Each experiment creates a timestamped folder in the outputs/ directory containing:
- Trained model
- Prediction results
- Detailed evaluation metrics (loss, F1 scores, etc.; see the sketch after this list)
- Cross-validation results
- Training metadata (time, duration, etc.)
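The metric computation itself is part of the pipeline; for reference, the competition's multi-class log loss and a macro F1 can be computed with scikit-learn as follows (the arrays hold dummy values, with 0 = low, 1 = medium, 2 = high interest):

```python
import numpy as np
from sklearn.metrics import f1_score, log_loss

y_true = np.array([0, 2, 1, 0])          # dummy ground-truth labels
y_proba = np.array([[0.7, 0.2, 0.1],     # dummy per-class probabilities
                    [0.1, 0.3, 0.6],
                    [0.2, 0.5, 0.3],
                    [0.6, 0.3, 0.1]])

print("log loss:", log_loss(y_true, y_proba))
print("macro F1:", f1_score(y_true, y_proba.argmax(axis=1), average="macro"))
```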
Edit the config.py file to modify the following (a hypothetical sketch follows the list):
- Data paths
- Random seed
- Cross-validation folds
- Test set proportion
- Model hyperparameters
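The actual contents of config.py aren't reproduced here; a sketch of what such a file might hold, with illustrative names and values:

```python
# Hypothetical config.py -- field names and values are illustrative only.
DATA_DIR = "data"
OUTPUT_DIR = "outputs"

RANDOM_SEED = 42        # seed for all stochastic components
CV_FOLDS = 5            # number of cross-validation folds
TEST_SIZE = 0.2         # held-out test set proportion

MODEL_PARAMS = {
    "xgb": {"n_estimators": 500, "max_depth": 6, "learning_rate": 0.1},
    "rf":  {"n_estimators": 300},
}
```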
Project structure:

```
├── run.py              # Main execution script
├── run_features.py     # Feature engineering script
├── config.py           # Configuration settings
├── src/
│   ├── features/       # Feature generation modules
│   ├── models/         # Model implementations
│   │   ├── xgb_model.py        # XGBoost implementation
│   │   ├── rf_model.py         # Random Forest implementation
│   │   ├── neural_network.py   # Neural Network implementation
│   │   └── ...
│   ├── utils/          # Utility functions
│   └── evaluation/     # Evaluation metrics
├── data/               # Data directory containing input files
└── outputs/            # Model outputs and results
```