Merged
9 changes: 9 additions & 0 deletions README.md
@@ -13,6 +13,15 @@
 1. [Contribution guidelines](./.github/CONTRIBUTING.md)
 1. [Project documents](./docs)
     1. [Approach](./docs/Approach.pdf)
+1. [Available modules](./mllib/lib)
+    1. [Clustering](./mllib/lib/cluster.py) - determines optimal _k_
+    1. [GLMNet](./mllib/lib/model.py) - classification/regression
+    1. [k-nearest neighbours](./mllib/lib/knn.py) - classification/regression
+    1. [Random forest](./mllib/lib/tree.py) - classification/timeseries/regression
+    1. [XGBoost](./mllib/lib/tree.py) - classification/timeseries/regression
+    1. [Traveling salesman problem](./mllib/lib/opt.py) - integer programming/heuristic
+    1. [Transportation problem](./mllib/lib/opt.py) - integer programming
+    1. [Time series](./mllib/lib/timeseries.py)
 1. [Pull request guidelines](./.github/PULL_REQUEST_TEMPLATE.md)
 1. [Initial setup](./README.md#initial-setup)
 1. [Unit tests](./README.md#run-unit-tests-and-pylint-ratings)
6 changes: 2 additions & 4 deletions logs/cov.out
@@ -1,12 +1,10 @@
 Name                                                            Stmts   Miss  Cover   Missing
 ---------------------------------------------------------------------------------------------
-/media/ph33r/Data/Project/CodeLib/Git/mllib/__init__.py             7      0   100%
-/media/ph33r/Data/Project/CodeLib/Git/mllib/lib/__init__.py         7      0   100%
 /media/ph33r/Data/Project/CodeLib/Git/mllib/lib/cluster.py        103      0   100%
 /media/ph33r/Data/Project/CodeLib/Git/mllib/lib/knn.py             70      0   100%
-/media/ph33r/Data/Project/CodeLib/Git/mllib/lib/model.py           44      0   100%
+/media/ph33r/Data/Project/CodeLib/Git/mllib/lib/model.py           52      0   100%
 /media/ph33r/Data/Project/CodeLib/Git/mllib/lib/opt.py            157      0   100%
 /media/ph33r/Data/Project/CodeLib/Git/mllib/lib/timeseries.py      60      0   100%
 /media/ph33r/Data/Project/CodeLib/Git/mllib/lib/tree.py           158      0   100%
 ---------------------------------------------------------------------------------------------
-TOTAL                                                             606      0   100%
+TOTAL                                                             600      0   100%
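The change in the TOTAL row can be reconciled from the per-file statement counts. The sketch below only re-adds the numbers shown in the coverage table; the dictionary names and the shortened relative paths are illustrative.

```python
# Statement counts per file, copied from the coverage report in this diff.
# Paths are shortened to be repository-relative (an illustrative choice).
old_stmts = {
    "mllib/__init__.py": 7,
    "mllib/lib/__init__.py": 7,
    "mllib/lib/cluster.py": 103,
    "mllib/lib/knn.py": 70,
    "mllib/lib/model.py": 44,
    "mllib/lib/opt.py": 157,
    "mllib/lib/timeseries.py": 60,
    "mllib/lib/tree.py": 158,
}

# After the change: both __init__.py files are gone and model.py has grown.
new_stmts = {
    name: count
    for name, count in old_stmts.items()
    if not name.endswith("__init__.py")
}
new_stmts["mllib/lib/model.py"] = 52

old_total = sum(old_stmts.values())
new_total = sum(new_stmts.values())
print(old_total, new_total)  # 606 600
```

Removing the two 7-statement `__init__.py` files and adding 8 statements to `model.py` accounts for the net drop of 6.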
4 changes: 0 additions & 4 deletions logs/pylint/lib-__init__-py.out

This file was deleted.

10 changes: 5 additions & 5 deletions logs/pylint/lib-knn-py.out
@@ -1,8 +1,8 @@
-************* Module mllib.lib.knn
-knn.py:176:45: I1101: Module 'metrics' has no 'rsq' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
-knn.py:177:45: I1101: Module 'metrics' has no 'mae' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
-knn.py:178:46: I1101: Module 'metrics' has no 'mape' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
-knn.py:179:46: I1101: Module 'metrics' has no 'rmse' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
+************* Module knn
+knn.py:175:45: I1101: Module 'metrics' has no 'rsq' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
+knn.py:176:45: I1101: Module 'metrics' has no 'mae' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
+knn.py:177:46: I1101: Module 'metrics' has no 'mape' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
+knn.py:178:46: I1101: Module 'metrics' has no 'rmse' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)

 --------------------------------------------------------------------
 Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)
10 changes: 5 additions & 5 deletions logs/pylint/lib-model-py.out
@@ -1,8 +1,8 @@
-************* Module mllib.lib.model
-model.py:166:41: I1101: Module 'metrics' has no 'rsq' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
-model.py:167:41: I1101: Module 'metrics' has no 'mae' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
-model.py:168:42: I1101: Module 'metrics' has no 'mape' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
-model.py:169:42: I1101: Module 'metrics' has no 'rmse' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
+************* Module model
+model.py:180:41: I1101: Module 'metrics' has no 'rsq' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
+model.py:181:41: I1101: Module 'metrics' has no 'mae' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
+model.py:182:42: I1101: Module 'metrics' has no 'mape' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
+model.py:183:42: I1101: Module 'metrics' has no 'rmse' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)

 --------------------------------------------------------------------
 Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)
10 changes: 5 additions & 5 deletions logs/pylint/lib-timeseries-py.out
@@ -1,8 +1,8 @@
-************* Module mllib.lib.timeseries
-timeseries.py:201:41: I1101: Module 'metrics' has no 'rsq' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
-timeseries.py:202:41: I1101: Module 'metrics' has no 'mae' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
-timeseries.py:203:42: I1101: Module 'metrics' has no 'mape' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
-timeseries.py:204:42: I1101: Module 'metrics' has no 'rmse' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
+************* Module timeseries
+timeseries.py:201:41: I1101: Module 'metrics' has no 'rsq' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
+timeseries.py:202:41: I1101: Module 'metrics' has no 'mae' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
+timeseries.py:203:42: I1101: Module 'metrics' has no 'mape' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
+timeseries.py:204:42: I1101: Module 'metrics' has no 'rmse' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)

 --------------------------------------------------------------------
 Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)
10 changes: 5 additions & 5 deletions logs/pylint/lib-tree-py.out
@@ -1,8 +1,8 @@
-************* Module mllib.lib.tree
-tree.py:96:45: I1101: Module 'metrics' has no 'rsq' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
-tree.py:97:45: I1101: Module 'metrics' has no 'mae' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
-tree.py:98:46: I1101: Module 'metrics' has no 'mape' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
-tree.py:99:46: I1101: Module 'metrics' has no 'rmse' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
+************* Module tree
+tree.py:96:45: I1101: Module 'metrics' has no 'rsq' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
+tree.py:97:45: I1101: Module 'metrics' has no 'mae' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
+tree.py:98:46: I1101: Module 'metrics' has no 'mape' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
+tree.py:99:46: I1101: Module 'metrics' has no 'rmse' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)

 --------------------------------------------------------------------
 Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)
4 changes: 0 additions & 4 deletions logs/pylint/mllib-__init__-py.out

This file was deleted.

7 changes: 0 additions & 7 deletions logs/pylint/tests-test_metrics-py.out
@@ -1,10 +1,3 @@
-************* Module tests.test_metrics
-test_metrics.py:61:22: I1101: Module 'mllib.lib.metrics' has no 'rsq' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
-test_metrics.py:69:22: I1101: Module 'mllib.lib.metrics' has no 'mse' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
-test_metrics.py:77:22: I1101: Module 'mllib.lib.metrics' has no 'rmse' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
-test_metrics.py:85:22: I1101: Module 'mllib.lib.metrics' has no 'mae' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
-test_metrics.py:93:22: I1101: Module 'mllib.lib.metrics' has no 'mape' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
-test_metrics.py:101:22: I1101: Module 'mllib.lib.metrics' has no 'aic' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)

 --------------------------------------------------------------------
 Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)
27 changes: 0 additions & 27 deletions mllib/__init__.py

This file was deleted.

27 changes: 0 additions & 27 deletions mllib/lib/__init__.py

This file was deleted.

7 changes: 3 additions & 4 deletions mllib/lib/knn.py
@@ -3,7 +3,7 @@

 **Available routines:**

-- class ``KNN``: Builds K-Nearest Neighnour model using cross validation.
+- class ``KNN``: Builds K-Nearest Neighbours model using cross validation.

 Credits
 -------
@@ -123,7 +123,7 @@ def __init__(self,
         self.model = None
         self.k_fold = k_fold
         if param is None:
-            max_k = max(int(len(self.df)/(self.k_fold * 2)), 1)
+            max_k = max(int(len(self.df) / (self.k_fold * 2)), 1)
             param = {"n_neighbors": list(range(1, max_k, 2)),
                      "weights": ["uniform", "distance"],
                      "metric": ["euclidean", "manhattan"]}
@@ -163,8 +163,7 @@ def _fit(self) -> Dict[str, Any]:
                           return_train_score=True,
                           cv=self.k_fold,
                           n_jobs=-1)
-        gs_op = gs.fit(self.df[self.x_var],
-                       self.df[self.y_var])
+        gs_op = gs.fit(self.df[self.x_var], self.df[self.y_var])
         self.model = gs_op
         return gs_op.best_params_
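To make the default search grid from the `__init__` hunk concrete, the sketch below recomputes it outside the class. The helper name `default_knn_grid` and its arguments are hypothetical; only the expression for `max_k` and the dictionary contents are taken from the diff.

```python
from typing import Any, Dict, List


def default_knn_grid(n_rows: int, k_fold: int) -> Dict[str, List[Any]]:
    """Rebuild the default grid shown in the hunk (hypothetical helper)."""
    # Same expression as in the diff, with len(self.df) replaced by n_rows.
    max_k = max(int(n_rows / (k_fold * 2)), 1)
    return {"n_neighbors": list(range(1, max_k, 2)),
            "weights": ["uniform", "distance"],
            "metric": ["euclidean", "manhattan"]}


grid = default_knn_grid(n_rows=100, k_fold=5)
print(grid["n_neighbors"])  # [1, 3, 5, 7, 9]
```

One consequence of the expression as written: when `n_rows` is below `4 * k_fold`, `max_k` is 1 and `range(1, 1, 2)` yields an empty `n_neighbors` list, so very small data frames may need an explicit `param`.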

16 changes: 15 additions & 1 deletion mllib/lib/model.py
@@ -16,10 +16,11 @@
 """

 # pylint: disable=invalid-name
-# pylint: disable=R0902,R0903,R0913,C0413
+# pylint: disable=R0902,R0903,R0913,C0413,W0511

 from typing import List, Dict

+import warnings
 import re
 import sys
 from inspect import getsourcefile
@@ -42,6 +43,17 @@
 # =============================================================================


+def ignore_warnings(test_func):
+    """Suppress warnings."""
+
+    def do_test(self, *args, **kwargs):
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")
+            test_func(self, *args, **kwargs)
+
+    return do_test
+
+
 class GLMNet():
     """GLMNet module.

@@ -133,6 +145,8 @@ def __init__(self,
         self._fit()
         self._compute_metrics()

+    # TODO: Remove this once GLMNet is updated
+    @ignore_warnings
     def _fit(self) -> None:
         """Fit the best GLMNet model."""
         train_x, test_x,\
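The `ignore_warnings` decorator added here discards the wrapped function's return value, which is harmless for `_fit` because it returns `None`. If the decorator were reused on a function that returns something, a variant along the following lines might be safer. This is a sketch and not part of the PR; `noisy_add` is a made-up function used only to exercise it.

```python
import functools
import warnings


def ignore_warnings(func):
    """Run ``func`` with warnings suppressed and pass its result through."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            return func(*args, **kwargs)

    return wrapper


@ignore_warnings
def noisy_add(a, b):
    """Hypothetical function that warns before returning."""
    warnings.warn("deprecated code path", DeprecationWarning)
    return a + b


print(noisy_add(1, 2))  # 3
```

Because `catch_warnings` restores the previous filters on exit, the suppression stays local to each call.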
4 changes: 2 additions & 2 deletions mllib/lib/opt.py
@@ -46,9 +46,9 @@

 class TSP:
     """
-    Travelling salesman problem.
+    Traveling salesman problem.

-    Module for `Travelling salesman problem
+    Module for `Traveling salesman problem
     <https://en.wikipedia.org/wiki/Travelling_salesman_problem>`_ using
     integer programming or nearest neighbour algorithm.

17 changes: 9 additions & 8 deletions requirements.txt
@@ -1,10 +1,11 @@
-pmdarima==1.8.0
-xgboost==1.5.0
-scipy==1.4.1
-xlrd==1.2.0
 PuLP==1.6.8
-numpy==1.18.1
-pandas==1.0.1
-Cython==0.29.15
-statsmodels==0.11.0
 xgboost==1.5.0
+statsmodels==0.13.0
+openpyxl==3.0.9
+pandas==1.3.5
+numpy==1.21.2
+Cython==0.29.25
+pmdarima==1.8.1
+xlrd==2.0.1
+scipy==1.7.3
+scikit_learn==1.0.2
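The diff does not say why `openpyxl` is added alongside the `xlrd` bump. A likely reason is that xlrd 2.x reads only legacy `.xls` workbooks, so `.xlsx` files need another reader. The sketch below shows that split; the helper `pick_excel_engine` and the file name are hypothetical, and pandas normally selects the engine on its own.

```python
from pathlib import Path


def pick_excel_engine(file_name: str) -> str:
    """Return a pandas ``read_excel`` engine name for a workbook.

    Hypothetical helper: it mirrors the split between xlrd (legacy .xls)
    and openpyxl (.xlsx/.xlsm) rather than anything in this repository.
    """
    suffix = Path(file_name).suffix.lower()
    if suffix == ".xls":
        return "xlrd"
    if suffix in {".xlsx", ".xlsm"}:
        return "openpyxl"
    raise ValueError(f"Unsupported workbook extension: {suffix!r}")


print(pick_excel_engine("sales.xlsx"))  # openpyxl
# With pandas installed, the result could be passed on explicitly, e.g.
# pd.read_excel("sales.xlsx", engine=pick_excel_engine("sales.xlsx")).
```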
19 changes: 8 additions & 11 deletions tests/test_model.py
@@ -51,6 +51,7 @@ def do_test(self, *args, **kwargs):
         with warnings.catch_warnings():
             warnings.simplefilter("ignore")
             test_func(self, *args, **kwargs)
+
     return do_test


@@ -63,21 +64,17 @@ def setUp(self):
     def test_known_equation(self):
         """GLMNet: Test a known equation"""
         df_ip = pd.read_csv(path + "test_glmnet.csv")
-        mod = GLMNet(df=df_ip,
-                     y_var="y",
-                     x_var=["x1", "x2", "x3"])
+        mod = GLMNet(df=df_ip, y_var="y", x_var=["x1", "x2", "x3"])
         op = mod.opt
-        self.assertEqual(np.round(op.get('intercept'), 0), 100.0)
-        self.assertEqual(np.round(op.get('coef')[0], 0), 2.0)
-        self.assertEqual(np.round(op.get('coef')[1], 0), 3.0)
-        self.assertEqual(np.round(op.get('coef')[2], 0), 0.0)
+        self.assertEqual(np.round(op.get("intercept"), 0), 100.0)
+        self.assertEqual(np.round(op.get("coef")[0], 0), 2.0)
+        self.assertEqual(np.round(op.get("coef")[1], 0), 3.0)
+        self.assertEqual(np.round(op.get("coef")[2], 0), 0.0)

     def test_predict_target_variable(self):
         """GLMNet: Test to predict a target variable"""
         df_ip = pd.read_csv(path + "test_glmnet.csv")
-        mod = GLMNet(df=df_ip,
-                     y_var="y",
-                     x_var=["x1", "x2", "x3"])
+        mod = GLMNet(df=df_ip, y_var="y", x_var=["x1", "x2", "x3"])
         df_predict = pd.DataFrame({"x1": [10, 20],
                                    "x2": [5, 10],
                                    "x3": [100, 0]})
@@ -91,5 +88,5 @@ def test_predict_target_variable(self):
 # --- Main
 # =============================================================================

-if __name__ == '__main__':
+if __name__ == "__main__":
     unittest.main()
3 changes: 2 additions & 1 deletion tests/test_timeseries.py
@@ -25,6 +25,7 @@

 import pandas as pd
 import xlrd
+import openpyxl

 # Set base path
 path = abspath(getsourcefile(lambda: 0))
@@ -34,7 +35,7 @@

 from mllib.lib.timeseries import AutoArima  # noqa: F841

-__all__ = ["xlrd", ]
+__all__ = ["xlrd", "openpyxl", ]

 # =============================================================================
 # --- DO NOT CHANGE ANYTHING FROM HERE
4 changes: 4 additions & 0 deletions tests/test_tree.py
@@ -24,6 +24,8 @@
 from os.path import abspath

 import pandas as pd
+import xlrd
+import openpyxl

 from sklearn.model_selection import train_test_split as split
 from sklearn import metrics as sk_metrics
@@ -37,6 +39,8 @@
 from mllib.lib.tree import RandomForest  # noqa: F841
 from mllib.lib.tree import XGBoost  # noqa: F841

+__all__ = ["xlrd", "openpyxl", ]
+
 # =============================================================================
 # --- DO NOT CHANGE ANYTHING FROM HERE
 # =============================================================================