Eco-Forest is a more effective version of gcForest
- unbalanced dataset
- Ref. imbalanced-learn (scikit-learn-contrib): http://contrib.scikit-learn.org/imbalanced-learn/stable/
- missing value
- Ref.
- feature combination
- Ref. FM, FFM
- layer/deep structure
- Ref. gcForest, eForest
- discrete or continuous variable
- abnormal data
- Analyze sklearn Tree source code (DONE)
- Analyze sklearn ensemble source code (DONE)
- Add Utils code: ForestUtils.py (DONE)
- Add Utils code: EnhancedDTree.py (DONE)
- Add Utils code: EnhancedForest.py (DONE)
- Case Study: Forest Driver dataset test (DONE)
- LayerForest v0.1 - LayerDTree (DONE)
- finish layer structure of DTree
- split the data by the Gini value of the leaf node
- test the model on global validation data
- output the test data
- LayerForest v0.2 - LayerDTree (DONE)
- driver dataset
- EnhancedDTree.py
- LayerForest v0.3 - LayerDTree (DONE)
- uci adult dataset
- compare with xgb, rf, decisiontree
- EnhancedDTree.py
- LayerForest v0.4 - LayerDTree++ (DONE)
- eliminate overfitting
- k-fold
- LayerForest v0.5 - LayerForest (DONE)
- EnhancedForest.py
- bug: misconvergence
- LayerForest v0.6 - LayerForest (ING)
- [D] debug: eliminate misconvergence [Dropout / Batch Normalization]
- debug: eliminate overfitting
- debug: eliminate overly quick convergence [?]
- exceed xgb, rf, decisiontree
- all to do:
- k-fold train [v]
- avg predict [v]
- threshold of all imp - avg [v]
- dropout by score of est [v]
- dropout by score of tree [x]
- LR predict [v]
- validation data split [v]
- LayerForest v0.7 (DONE)
- Simplify Procedure
- Data Load Utils
- Multiclass Support
- LayerForest v0.8 (ING)
- Simplify Procedure
- Model Utils
- DecomposerForest
- AlgorithmUtils
- LayerForest v1.0 - ecoForest
- Validation Data Split
- MaxLayer Control
- Train/Validation Loss Guide
- Freq/Lift/Support Score
- K-Fold
- Smart Early Stop
- LR Stacker
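
The k-fold training, averaged prediction, and LR stacker items above can be sketched roughly as follows. This is a minimal illustration built on scikit-learn only, not the code in EnhancedForest.py: the function name `kfold_forest_features`, the synthetic dataset, and all hyperparameters are assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, train_test_split


def kfold_forest_features(X_train, y_train, X_valid, n_splits=3, seed=0):
    """Hypothetical helper: train one forest per fold.

    Returns out-of-fold class probabilities for the training rows and
    fold-averaged class probabilities for the validation rows.
    """
    n_classes = len(np.unique(y_train))
    oof = np.zeros((X_train.shape[0], n_classes))
    valid_avg = np.zeros((X_valid.shape[0], n_classes))
    # Stratified folds keep every class present in each training split.
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for fit_idx, hold_idx in folds.split(X_train, y_train):
        forest = RandomForestClassifier(n_estimators=50, random_state=seed)
        forest.fit(X_train[fit_idx], y_train[fit_idx])
        # Each training row is predicted only by the forest that never saw it.
        oof[hold_idx] = forest.predict_proba(X_train[hold_idx])
        # "avg predict": average the fold models on the validation data.
        valid_avg += forest.predict_proba(X_valid) / n_splits
    return oof, valid_avg


# Synthetic data stands in for the driver / UCI adult datasets (assumption).
X, y = make_classification(n_samples=600, n_features=20, n_informative=5,
                           random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

oof, valid_avg = kfold_forest_features(X_train, y_train, X_valid)

# "LR Stacker": logistic regression fitted on the out-of-fold probabilities.
stacker = LogisticRegression(max_iter=1000)
stacker.fit(oof, y_train)
valid_acc = float(np.mean(stacker.predict(valid_avg) == y_valid))
print(f"validation accuracy: {valid_acc:.3f}")
```

Fitting the stacker on out-of-fold probabilities rather than in-sample ones is one common way to limit the overfitting noted in the v0.4 and v0.6 entries.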
EnhancedForest_multiclass_v0.2: before AlgorithmUtils. 12.10
- Kaggle Datasets:
- UCI Datasets
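
The LayerDTree v0.1 step of splitting the data by the Gini value of the leaf node might look like the sketch below. It assumes a plain scikit-learn `DecisionTreeClassifier`; the helper name `split_by_leaf_gini`, the threshold of 0.2, and the synthetic data are illustrative assumptions rather than what EnhancedDTree.py does.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier


def split_by_leaf_gini(tree, X, gini_threshold=0.2):
    """Hypothetical helper: flag samples that land in low-impurity leaves.

    Returns a boolean mask that is True where the sample's leaf has a
    Gini impurity at or below the threshold.
    """
    leaf_ids = tree.apply(X)                   # leaf node index per sample
    leaf_gini = tree.tree_.impurity[leaf_ids]  # impurity stored at fit time
    return leaf_gini <= gini_threshold


# Synthetic data stands in for a real dataset (assumption).
X, y = make_classification(n_samples=500, n_features=10, n_informative=4,
                           random_state=0)
tree = DecisionTreeClassifier(criterion="gini", max_depth=3,
                              random_state=0).fit(X, y)

confident = split_by_leaf_gini(tree, X)
# Samples in impure leaves would be handed to the next layer.
X_next, y_next = X[~confident], y[~confident]
print(f"{int(confident.sum())} samples settled, "
      f"{len(y_next)} passed to the next layer")
```

The same mask can be computed for held-out validation data by calling `split_by_leaf_gini(tree, X_valid)`, since the leaf impurities are fixed once the tree is fitted.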
Happy Hacking.