Currently, the function uses train_test_split() from sklearn twice, but with large datasets it becomes slow and memory-demanding because multiple intermediate DataFrames are created.
The solution would be to instead build a single list of split labels, [train, selection, train, validation ...], and attach it to the original DataFrame.
Referenced code: cobra/cobra/preprocessing/preprocessor.py, line 340 in commit 9141313:

    def train_selection_validation_split(data: pd.DataFrame,
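A minimal sketch of the proposed label-list approach. The function name and its `data` parameter come from the referenced code; the proportion and seed parameters and the `"split"` column name are assumptions for illustration, not the actual preprocessor API:

```python
import numpy as np
import pandas as pd


def train_selection_validation_split(data: pd.DataFrame,
                                     train_prop: float = 0.6,
                                     selection_prop: float = 0.2,
                                     validation_prop: float = 0.2,
                                     seed: int = 42) -> pd.DataFrame:
    """Tag each row as 'train', 'selection', or 'validation' via a single
    shuffled label list, instead of materialising three DataFrames with
    two calls to sklearn's train_test_split()."""
    if not np.isclose(train_prop + selection_prop + validation_prop, 1.0):
        raise ValueError("split proportions must sum to 1")
    nrows = len(data)
    n_train = int(round(train_prop * nrows))
    n_select = int(round(selection_prop * nrows))
    # Build the label list [train, ..., selection, ..., validation, ...]
    labels = (["train"] * n_train
              + ["selection"] * n_select
              + ["validation"] * (nrows - n_train - n_select))
    # Shuffle once so the assignment is random but reproducible
    rng = np.random.default_rng(seed)
    rng.shuffle(labels)
    out = data.copy()
    out["split"] = labels
    return out
```

With this shape, downstream code selects a partition with a boolean mask (e.g. `df[df["split"] == "train"]`) only when it actually needs it, avoiding the up-front copies.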