What needs to happen?
Spark uses different default storage levels depending on the API:
- RDD API: MEMORY_ONLY
- Dataset API: MEMORY_AND_DISK
Currently, the default storage level is set to MEMORY_ONLY for all runners. However, the default storage level of the Dataset runner should match Spark's own default of MEMORY_AND_DISK.
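The difference in defaults can be seen directly in Spark's API: calling `persist()` with no arguments on an RDD versus a Dataset yields different storage levels. A minimal sketch (assuming a local Spark session; not runnable without a Spark installation):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("storage-level-defaults")
  .getOrCreate()

// RDD API: no-arg persist() defaults to MEMORY_ONLY
val rdd = spark.sparkContext.parallelize(1 to 10).persist()
assert(rdd.getStorageLevel == StorageLevel.MEMORY_ONLY)

// Dataset API: no-arg persist() defaults to MEMORY_AND_DISK
val ds = spark.range(10).persist()
assert(ds.storageLevel == StorageLevel.MEMORY_AND_DISK)
```

Matching the Dataset runner's default to MEMORY_AND_DISK would keep Beam pipelines consistent with what Spark users already expect from that API.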
Issue Priority
Priority: 3 (nice-to-have improvement)
Issue Components