This repository was archived by the owner on Nov 17, 2023. It is now read-only.
If a dataset has categorical variables, they need to be one-hot encoded, and the resulting matrix can be 100x or even 1000x bigger if a dense rather than a sparse representation is used. It seems mxnet currently supports only dense input, so it is easy to hit RAM limits even with fairly small datasets on the largest EC2 GPU instance, g2.8xlarge, with 60GB RAM.
For example, a 10M-row sample of the well-known airline dataset is ~2GB in sparse representation, but over 60GB in dense.
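To make the size gap concrete, here is a rough back-of-the-envelope sketch in plain Python. It assumes a CSR-style sparse layout (one value and one column index per non-zero, plus one row pointer per row) and float32 values. The column cardinalities below are made up for illustration and are not the real airline dataset's, and the helper names `dense_bytes` and `csr_bytes` are hypothetical.

```python
def dense_bytes(n_rows, cardinalities, value_size=4):
    """Bytes for a dense one-hot matrix: every cell is stored."""
    n_cols = sum(cardinalities)  # one output column per category level
    return n_rows * n_cols * value_size


def csr_bytes(n_rows, cardinalities, value_size=4, index_size=4):
    """Bytes for a CSR one-hot matrix: only the non-zeros are stored."""
    # One-hot encoding puts exactly one non-zero per categorical column per row.
    nnz = n_rows * len(cardinalities)
    data = nnz * value_size            # the stored 1.0 values
    indices = nnz * index_size         # column index of each non-zero
    indptr = (n_rows + 1) * index_size # row start offsets
    return data + indices + indptr


if __name__ == "__main__":
    n_rows = 10_000_000
    # Hypothetical cardinalities, chosen only to illustrate the effect.
    cardinalities = [12, 31, 7, 22, 300, 300]

    dense = dense_bytes(n_rows, cardinalities)
    sparse = csr_bytes(n_rows, cardinalities)
    print(f"dense:  {dense / 1e9:.2f} GB")
    print(f"sparse: {sparse / 1e9:.2f} GB")
    print(f"ratio:  {dense / sparse:.0f}x")
```

The dense size grows with the total number of category levels, while the sparse size grows only with the number of categorical columns, so high-cardinality columns are what drive the ratio up.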