feat: Add config max_temp_directory_size to limit max disk usage for spilling queries #15520
alamb merged 6 commits into apache:main
Pull Request Overview
This PR adds a new configuration to limit temporary disk usage for spilling queries by introducing the max_temp_directory_size setting in DiskManager and updating related components.
- Adds max_temp_directory_size and used_disk_space to DiskManager to track and enforce the disk usage limit.
- Updates RefCountedTempFile and InProgressSpillFile to update the global disk usage after file modifications.
- Introduces integration tests to verify behavior when the disk spill limit is reached versus not reached.
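The bookkeeping summarized above can be illustrated with a minimal sketch. This is a simplified stand-in, not the actual DataFusion `DiskManager`: the `DiskQuota` type, its `update` method, and the `String` error are hypothetical names chosen for illustration, and only the two field names are taken from the PR. It assumes the counter is incremented first and checked against the limit afterwards.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical model of the global disk-usage accounting.
/// Only the field names mirror the PR; everything else is illustrative.
pub struct DiskQuota {
    /// Upper bound, in bytes, on the total size of temporary files.
    max_temp_directory_size: u64,
    /// Bytes currently accounted for across all temporary files.
    used_disk_space: AtomicU64,
}

impl DiskQuota {
    pub fn new(max_temp_directory_size: u64) -> Self {
        Self {
            max_temp_directory_size,
            used_disk_space: AtomicU64::new(0),
        }
    }

    /// Total bytes currently recorded.
    pub fn used(&self) -> u64 {
        self.used_disk_space.load(Ordering::Relaxed)
    }

    /// Record that one file changed from `old_size` to `new_size` bytes.
    /// The counter is always updated, because the bytes are already on
    /// disk; an error is returned if the new total exceeds the limit.
    pub fn update(&self, old_size: u64, new_size: u64) -> Result<(), String> {
        if new_size >= old_size {
            let delta = new_size - old_size;
            let total = self.used_disk_space.fetch_add(delta, Ordering::Relaxed) + delta;
            if total > self.max_temp_directory_size {
                return Err(format!(
                    "temporary files use {total} bytes, limit is {} bytes",
                    self.max_temp_directory_size
                ));
            }
        } else {
            // The file shrank or was removed: give the space back.
            self.used_disk_space
                .fetch_sub(old_size - new_size, Ordering::Relaxed);
        }
        Ok(())
    }
}
```

In this model a caller that receives the error is expected to stop spilling and release its files, which lowers the counter again through `update(old_size, 0)`.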
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| datafusion/physical-plan/src/spill/spill_manager.rs | Updated error documentation for spill functions. |
| datafusion/physical-plan/src/spill/in_progress_spill_file.rs | Enhanced error docs and updated disk usage after appending batches. |
| datafusion/execution/src/disk_manager.rs | Added disk usage tracking fields, methods and updated temporary file handling. |
| datafusion/core/tests/memory_limit/mod.rs | Added tests validating disk usage limits for spilling queries. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
alamb left a comment:
Thanks @2010YOUY01 -- this looks 👨🍳 👌 to me ❤️
if let Some(writer) = &mut self.writer {
    let (spilled_rows, spilled_bytes) = writer.write(batch)?;
    if let Some(in_progress_file) = &mut self.in_progress_file {
        in_progress_file.update_disk_usage()?;
It is quite nice that this is encapsulated as part of InProgressSpillFile
/// tempfiles are cleaned up.
#[tokio::test]
async fn test_disk_spill_limit_not_reached() -> Result<()> {
    let disk_spill_limit = 100 * 1024 * 1024; // 100MB
do we really need to generate 100MB to test temporary file space? Could we perhaps lower this to something less resource intensive like 1MB (and reduce the argument to generate_series)?
5506637 reduced the disk and memory usage of UT to < 1MB
Thanks again @2010YOUY01
…r spilling queries (apache#15520)
* Add disk limit field inside disk manager
* Implement disk usage tracking
* Update datafusion/execution/src/disk_manager.rs
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Let unit-test use less memory
* reduce UT's memory and disk usage to < 1MB
* typo
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Which issue does this PR close?
Rationale for this change
See the rationale section of the first attempt, PR #14975.
The changes and implementation in this PR differ from that attempt; the differences are explained below.
What changes are included in this PR?
- A max_temp_directory_size field inside DiskManager to limit the total disk usage for temporary files, which DiskManager now keeps track of; by default the limit is 100GB.
- RefCountedTempFile:
  - A reference to the DiskManager that created it.
  - update_disk_usage() to update the global disk usage. After modifying the managed tempfile, the caller must also call this function so that an error is returned when the disk limit is exceeded. (Currently RefCountedTempFile is only used for spill files inside DataFusion, so I think this additional interface is okay to add.)

Are these changes tested?
Yes, integration tests are included for queries that exceed and that do not exceed the disk limit.
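The caller contract around update_disk_usage(), and the two outcomes the tests distinguish, can be sketched with a toy model. Nothing here is the real DataFusion API: `ToyDiskManager`, `ToyTempFile`, and `create_tmp_file` are hypothetical names, and the file size is passed in explicitly as an assumption to keep the example free of real I/O. The sketch also assumes that dropping a file handle gives its bytes back to the global counter.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical manager that owns the limit and the global counter.
pub struct ToyDiskManager {
    max_temp_directory_size: u64,
    used_disk_space: AtomicU64,
}

impl ToyDiskManager {
    pub fn new(max_temp_directory_size: u64) -> Arc<Self> {
        Arc::new(Self {
            max_temp_directory_size,
            used_disk_space: AtomicU64::new(0),
        })
    }

    pub fn used_disk_space(&self) -> u64 {
        self.used_disk_space.load(Ordering::Relaxed)
    }

    /// Hand out a file handle that remembers which manager created it.
    pub fn create_tmp_file(self: &Arc<Self>) -> ToyTempFile {
        ToyTempFile {
            manager: Arc::clone(self),
            current_size: 0,
        }
    }
}

/// Hypothetical temp-file handle; `current_size` is this file's share
/// of the global counter.
pub struct ToyTempFile {
    manager: Arc<ToyDiskManager>,
    current_size: u64,
}

impl ToyTempFile {
    /// Called by the writer after every modification of the file.
    /// `new_size` stands in for the file's size on disk (an assumption).
    pub fn update_disk_usage(&mut self, new_size: u64) -> Result<(), String> {
        let counter = &self.manager.used_disk_space;
        // Swap this file's previous contribution for the new one.
        counter.fetch_sub(self.current_size, Ordering::Relaxed);
        let total = counter.fetch_add(new_size, Ordering::Relaxed) + new_size;
        self.current_size = new_size;
        if total > self.manager.max_temp_directory_size {
            return Err(format!(
                "disk limit exceeded: {total} > {}",
                self.manager.max_temp_directory_size
            ));
        }
        Ok(())
    }
}

impl Drop for ToyTempFile {
    fn drop(&mut self) {
        // Removing the file releases its share of the global usage.
        self.manager
            .used_disk_space
            .fetch_sub(self.current_size, Ordering::Relaxed);
    }
}
```

With a 1024-byte limit, a single file growing to 900 bytes stays within budget, a second file adding 200 bytes gets an error, and once both handles are dropped the counter is back at zero.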
Are there any user-facing changes?