Digital Traffic Scoring Solution

Business Case

Create a sustainable and scalable method for tracking dealer conversion performance through meaningful KPIs and calculated scores. The scores are based on predictive models built to learn each dealership's expected sales and service figures.

Whereas traditional models define expected performance based on geographical comparison and industry standards, this model aims to understand fluctuations in volume based on seasonality and similarly sized dealerships. Metrics pertaining to lead suppliers and breakdowns by lead type are future considerations.

The two main areas of focus are listed below with their respective KPIs, which were found to have a strong correlation with the target metric, in this case actual sales. Service has not yet been implemented and is therefore left out; it follows a similar structure to the sales methodology.

Generating Sales:   VDP Views | Unique Visitors | Lead Volume | Visitor Conversions | Actual Sales

Managing Sales:     Close Rate | SDSV Close Rate | Sales Loyalty | % of Leads Responded to Within 30 Minutes
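As a rough illustration of how two of the Managing Sales KPIs could be computed, here is a minimal sketch. The definition of close rate as sales divided by leads, the function names, and the sample timestamps are all assumptions made for this example; they are not taken from the project's scripts.

```python
from datetime import datetime, timedelta


def close_rate(sales, leads):
    """Share of leads that became sales (assumed definition: sales / leads)."""
    return sales / leads if leads else 0.0


def pct_responded_within(leads, minutes=30):
    """Percent of leads whose first response arrived within `minutes`.

    `leads` is a list of (received_at, responded_at) pairs; responded_at
    is None when the lead was never answered.
    """
    if not leads:
        return 0.0
    limit = timedelta(minutes=minutes)
    on_time = sum(
        1
        for received, responded in leads
        if responded is not None and responded - received <= limit
    )
    return 100.0 * on_time / len(leads)


# Hypothetical leads, all received at the same time for simplicity.
t0 = datetime(2020, 1, 6, 9, 0)
sample_leads = [
    (t0, t0 + timedelta(minutes=12)),  # answered in time
    (t0, t0 + timedelta(minutes=45)),  # answered too late
    (t0, None),                        # never answered
    (t0, t0 + timedelta(minutes=30)),  # exactly at the limit, counted as in time
]

print(close_rate(5, 20))
print(pct_responded_within(sample_leads))
```

Whether a response at exactly 30 minutes counts as on time is a choice made here for the sketch; the real KPI definition may differ.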

The implementation below was developed using AWS, Python, PySpark, and Tableau. Because of PPI and data rights, the data is a mock representation of data seen in the field, and the implementation is intended as a mock representation of the process.

Architecture

The following diagram shows the general architecture used to create the models. A more descriptive overview follows:

The mock data is created using a Jupyter Notebook running on EC2
Mock data is sent to S3 using the Boto3 SDK
An event trigger starts Glue's PySpark ETL handling
Data is reloaded into S3 in two forms: the full data set, and a newly created minimized data set with a unique ID for the ML process
Loading the minimized data set triggers the machine learning batch process
The batch process returns its results to S3
Predictive results are combined with the original file using a PySpark data transformation
Results are stored in S3 and downloaded
Data is imported into Tableau for visualization and further calculations
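The split and recombine steps in the overview above can be sketched as follows. This is plain Python standing in for the PySpark transformation, not the project's ETL script; the column names, the `row_id` format, and the prediction values are hypothetical.

```python
def split_for_ml(rows, feature_cols):
    """Tag each row with a unique ID and return (full, minimized) copies.

    The minimized copy keeps only the ID and the feature columns, which is
    the shape assumed here for the batch prediction input.
    """
    full, minimized = [], []
    for i, row in enumerate(rows):
        row_id = f"row-{i:06d}"
        full.append({"row_id": row_id, **row})
        minimized.append({"row_id": row_id, **{c: row[c] for c in feature_cols}})
    return full, minimized


def attach_predictions(full, predictions):
    """Left-join predictions onto the full rows using the shared row_id."""
    by_id = {p["row_id"]: p["predicted_sales"] for p in predictions}
    return [{**row, "predicted_sales": by_id.get(row["row_id"])} for row in full]


# Hypothetical mock rows.
rows = [
    {"dealer": "Dealer A", "vdp_views": 1200, "lead_volume": 85, "actual_sales": 42},
    {"dealer": "Dealer B", "vdp_views": 900, "lead_volume": 60, "actual_sales": 30},
]
full, minimized = split_for_ml(rows, ["vdp_views", "lead_volume"])

# Stand-in for the batch prediction output, deliberately out of order.
predictions = [
    {"row_id": "row-000001", "predicted_sales": 28.0},
    {"row_id": "row-000000", "predicted_sales": 40.5},
]
combined = attach_predictions(full, predictions)
```

Carrying the ID through both files means the join does not depend on row order, which matters if the batch output is returned in a different order from the input.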

Data Recreation

The mock data for training the model was created using the two scripts below:

Generating

Managing

The batch prediction set that makes up the report data is shifted slightly both up and down so that the metrics show enough variance to simulate normal changes in dealerships. An example of this can be seen in the slight modification of the mu and sigma of the normal distribution in the sample file below:
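To make the idea of nudging mu and sigma concrete, here is a minimal sketch using the standard library. The baseline values, the size of the shift, and the function names are assumptions for illustration; the notebooks may use different distributions, libraries, and parameters.

```python
import random


def altered_params(mu, sigma, mu_shift=0.0, sigma_scale=1.0):
    """Return (mu, sigma) after a relative shift of mu and a scaling of sigma."""
    return mu * (1 + mu_shift), sigma * sigma_scale


def sample_metric(mu, sigma, n, seed=None):
    """Draw n values from a normal distribution with the given mu and sigma."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Hypothetical baseline for a close-rate style metric in the training data.
base_mu, base_sigma = 0.15, 0.03
training = sample_metric(base_mu, base_sigma, 5000, seed=1)

# Batch prediction data: mean nudged up 5%, spread widened by 20%.
batch_mu, batch_sigma = altered_params(base_mu, base_sigma, mu_shift=0.05, sigma_scale=1.2)
batch = sample_metric(batch_mu, batch_sigma, 5000, seed=2)
```

A negative `mu_shift` moves the metric down instead, so the same helper covers both directions of change.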

Managing Sample

Regression Results

The log results of the ML training can be found here

Visualization

To visualize the results of the scoring system, a Tableau dashboard was created to show how the process works with mock data. A sample image is shown below, and the full hosted report can be found on Tableau Public here.
