FinML-Toolkit is a comprehensive Python library designed to streamline the full workflow of applying Machine Learning (ML) to financial time series data. It provides a modular and extensible framework for:
- Data handling
- Feature engineering
- Model training & evaluation
- Signal generation
- Interactive visualization
- Robust Data Handling: Load and filter financial time series from .pkl files using date ranges.
- Advanced Feature Engineering: Add technical indicators (e.g., RSI, MACD), volume analysis, and more.
- Flexible ML Architectures:
  - LSTM, GRU, Conv1D, Attention, Transformer (via TensorFlow/Keras)
  - RandomForest, XGBoost, CatBoost, LightGBM (via scikit-learn & boosting libraries)
- Imbalanced Data Handling: SMOTE, ADASYN, NearMiss, TomekLinks, SMOTETomek, SMOTEENN, etc.
- Class Weight Optimization: Optimize class weights with scipy.optimize.minimize (Nelder-Mead, SLSQP, etc.).
- Model Evaluation: Accuracy, precision, recall, F1-score, MSE, RMSE, R², confusion matrix.
- Interactive Plotting: Plotly-based heatmaps, forecasting charts, actual vs. predicted lines/candlesticks, and trading signal visualizations.
- Modular Design: Clean structure with dedicated modules for each pipeline stage.
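The class weight optimization mentioned in the feature list can be illustrated with a small standalone sketch. It does not use or reproduce the toolkit's own code: the synthetic data, the tiny NumPy logistic regression, the helper names (`fit_weighted_logreg`, `f1_minority`, `negative_f1`) and the choice of minority-class F1 as the objective are all assumptions made for illustration. Only `scipy.optimize.minimize` with `method="Nelder-Mead"` comes from the description above.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Synthetic imbalanced data (assumption): 900 samples of class 0, 100 of class 1.
X = np.vstack([rng.normal(0.0, 1.0, (900, 2)), rng.normal(1.5, 1.0, (100, 2))])
y = np.concatenate([np.zeros(900, dtype=int), np.ones(100, dtype=int)])

# Shuffle, then hold out 30% of the samples for validation.
order = rng.permutation(len(y))
split = int(0.7 * len(y))
train, val = order[:split], order[split:]


def fit_weighted_logreg(X, y, w_pos, steps=300, lr=0.5):
    """Fit logistic regression by gradient descent; class 1 samples get weight w_pos."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    sample_w = np.where(y == 1, w_pos, 1.0)
    beta = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Xb @ beta)))
        grad = Xb.T @ (sample_w * (p - y)) / sample_w.sum()
        beta -= lr * grad
    return beta


def predict(beta, X):
    """Predict class 1 where the linear score is positive."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (Xb @ beta > 0).astype(int)


def f1_minority(y_true, y_pred):
    """F1-score of class 1, computed from raw counts."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def negative_f1(params):
    """Objective for the optimizer: train with the candidate weight, score on validation."""
    w_pos = abs(params[0]) + 1e-6  # keep the weight strictly positive
    beta = fit_weighted_logreg(X[train], y[train], w_pos)
    return -f1_minority(y[val], predict(beta, X[val]))


baseline_f1 = -negative_f1(np.array([1.0]))  # equal class weights
result = minimize(negative_f1, x0=[1.0], method="Nelder-Mead",
                  options={"maxiter": 40, "xatol": 1e-2, "fatol": 1e-3})
best_weight = abs(result.x[0]) + 1e-6
best_f1 = -result.fun

print(f"baseline F1: {baseline_f1:.3f}")
print(f"best weight: {best_weight:.2f}, F1: {best_f1:.3f}")
```

Nelder-Mead is derivative-free, which suits a step-like objective such as F1. A gradient-based method such as SLSQP would likely need a smoother surrogate objective (for example, weighted log-loss).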
FinML-Toolkit/
├── setup.py
├── examples/
│   ├── Basic_ML_Model_Training.ipynb
│   └── ...
├── fin_data/
│   ├── EUR_USD_H4.pkl
│   └── ...
└── ml_toolkit/
    ├── __init__.py
    ├── data_handler.py
    ├── error_handler.py
    ├── feature_engineer.py
    ├── model_builder.py
    ├── model_evaluator.py
    ├── model_executor.py
    ├── model_forecaster.py
    └── visualizer.py
- Python 3.8+
- Recommended: Use a virtual environment
python -m venv venv
source venv/bin/activate # Linux/macOS
# venv\Scripts\activate # Windows

From the project root (where setup.py is):
pip install -e .

from ml_toolkit.data_handler import DataHandler
df = DataHandler.load_data('GBP_USD_H4.pkl', start_date="2020-01-01", end_date="2024-07-29 21:00")

from ml_toolkit.feature_engineer import FeatureEngineer
df, time, X, y = FeatureEngineer.add_features(df=df, features=['RSI', 'MACD', 'Volume'])

from ml_toolkit.model_builder import ModelBuilder
input_shape = (X.shape[1], X.shape[2]) if X.ndim == 3 else (X.shape[1],)
model = ModelBuilder.create_lstm_model(input_shape=input_shape, output_units=3, output_activation='softmax')

from ml_toolkit.model_executor import ModelExecutor
model, acc, prec, rec, f1, conf, mse, rmse = ModelExecutor.execute_model(
    file_path='GBP_USD_H4.pkl',
    start_date="2020-01-01",
    end_date="2024-07-29 21:00",
    features=['RSI', 'MACD', 'Volume'],
    model_type='LSTM',
    epochs=10
)

from ml_toolkit.visualizer import Visualizer
# visualizer = Visualizer()
# visualizer.plot_actual_vs_predicted_lines(y_test, y_pred)

- Basic_ML_Model_Training.ipynb
- Basic_Regression_Model_Training.ipynb
- Class_Weight_Optimization.ipynb
- Feature_Selection_Optimization.ipynb
- Imbalanced_Dataset_Evaluation.ipynb
- Correlation_Heatmap_Visualization.ipynb
- Peak_Detection_Visualization.ipynb
- Trades_Evaluation.ipynb
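The `input_shape` expression in the model-building snippet earlier handles two layouts: a 3D array of shape (samples, time steps, features) for sequence models, and a 2D array of shape (samples, features) otherwise. The sketch below shows how each layout maps to an input shape, using NumPy only. The `make_windows` and `infer_input_shape` helpers are hypothetical, and whether `FeatureEngineer.add_features` builds its windows this way is an assumption.

```python
import numpy as np


def make_windows(features, targets, window):
    """Stack `window` consecutive rows into one sample.

    The target of each sample is the label of the last row in its window.
    """
    n_samples = len(features) - window + 1
    X = np.stack([features[i:i + window] for i in range(n_samples)])
    y = targets[window - 1:]
    return X, y


def infer_input_shape(X):
    """Same rule as in the snippet above: drop the leading samples axis."""
    return (X.shape[1], X.shape[2]) if X.ndim == 3 else (X.shape[1],)


# Dummy data: 10 time steps, 2 features, 3 classes.
features = np.arange(20, dtype=float).reshape(10, 2)
targets = np.arange(10) % 3

X_seq, y_seq = make_windows(features, targets, window=4)

print(X_seq.shape)                  # (7, 4, 2)
print(infer_input_shape(X_seq))     # (4, 2)  -> sequence models such as LSTM
print(infer_input_shape(features))  # (2,)    -> flat, tabular input
```

With a window of 4 over 10 rows, only 7 complete windows exist, so `y_seq` is trimmed to the same length.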
Sample notebook output (note that predictions for M15 are far better than for H4):
- Debugging report
- Trades log
- Trades metrics
- Modular, extensible codebase.
- Add your own model logic in model_builder.py.
- Financial data is expected in .pkl format in fin_data/.
- Custom indicators must be added manually.
- Optimization methods may need tuning.
- Predictions at shorter timeframes (e.g., M15) are far better than those at longer timeframes (e.g., H4).
- Live data feed integration (e.g., Oanda, Binance)
- Backtesting module
- SHAP/LIME for explainability
- More advanced forecasting models
- Model deployment utilities
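The notes above say that price data is expected as .pkl files and that custom indicators are added manually. The following sketch shows one way to prepare such data with plain pandas, under several assumptions: the column names (`Close`, `Volume`), the file name `EXAMPLE_H4.pkl` and the `rsi` helper are hypothetical, and the file layout that `DataHandler` requires is not documented here. The RSI variant shown smooths gains and losses exponentially with alpha = 1/period, which is one common formulation and not necessarily the one the toolkit uses.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd


def rsi(close: pd.Series, period: int = 14) -> pd.Series:
    """Relative Strength Index using exponential smoothing with alpha = 1/period."""
    delta = close.diff()
    gain = delta.clip(lower=0)
    loss = -delta.clip(upper=0)
    avg_gain = gain.ewm(alpha=1 / period, min_periods=period, adjust=False).mean()
    avg_loss = loss.ewm(alpha=1 / period, min_periods=period, adjust=False).mean()
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)


# Synthetic 4-hour candles from a random walk; real files would hold market data.
rng = np.random.default_rng(42)
index = pd.date_range("2024-01-01", periods=120, freq=pd.Timedelta(hours=4))
close = pd.Series(1.10 + rng.normal(0, 0.001, len(index)).cumsum(), index=index)
candles = pd.DataFrame({"Close": close, "Volume": rng.integers(100, 1000, len(index))})

# Round-trip through a pickle file, as a stand-in for a file in fin_data/.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "EXAMPLE_H4.pkl"  # hypothetical file name
    candles.to_pickle(path)
    loaded = pd.read_pickle(path)

# Add the custom indicator on the full history, then filter by date range.
loaded["RSI_14"] = rsi(loaded["Close"])
start, end = pd.Timestamp("2024-01-05"), pd.Timestamp("2024-01-10 20:00")
window = loaded.loc[start:end]  # label-based slicing includes both endpoints

print(window.shape)  # (36, 3)
print(window.tail(3))
```

Computing the indicator before filtering avoids the warm-up gap that would otherwise appear at the start of the selected range.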
git clone https://github.com/a-dorgham/FinML-Toolkit.git
cd FinML-Toolkit
# Create a branch and submit a PR

- Email: a.k.y.dorgham@gmail.com
- GitHub Issues: FinML-Toolkit Issues
- Connect: LinkedIn | Google Scholar | ResearchGate | ORCID


