-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Fix a race condition issue on reading spilled file #7538
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @sarutak . Very excellent debugging ❤️
I think this PR will result in DataFusion leaving many temporary directories around, which is likely not ideal for many users.
I am not sure if you have seen the fix for the test from @viirya in #7534
I have another solution but it breaks the API compatibility of DiskManager::create_tmp_file, which returns a pair of Arc and NamedTempFile to ensure TempDir lives long enough.
I think this is the solution that we should pursue as it is the easiest to use long term. Perhaps we can change the API to return something like
/// A named temporary file and the directory it lives in, which may be cleaned up on drop
struct RefCountedTempFile {
/// directory in which temporary files are created
temp_dir: Arc<TempDir>,
/// the temporary file
tempfile: NamedTempFile,
}
impl RefCountedTempFile {
pub fn path(&self) -> &Path {..}
pub fn inner(&self) -> &NamedTempFile {...}
...
}
impl DiskManager {
pub fn create_tmp_file(
&self,
request_description: &str
) -> Result<RefCountedTempFile, DataFusionError>
...
}Another potential solution I thought of was to pass a reference to the DiskManager to read_spill_as_stream to prevent the file from being deleted. However, that seems like a hard API to use and get right as all callers would have to know temp files were only scoped to the lifetime of the DiskManager
| /// If `Some(vec![])` temporary files will be created in the directories. | ||
| /// If `None` an error will be returned (configured not to spill) | ||
| local_dirs: Mutex<Option<Vec<TempDir>>>, | ||
| local_dirs: Mutex<Option<Vec<PathBuf>>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I correct in understanding that the implication here is that the directories in local_dirs will not be removed (though all the contents will be removed, when the NamedTempFile are dropped)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This solution uses directories specified, or returned by std::env::temp_dir(), which is usually /tmp on Linux.
So, these directories are not deleted when the NamedTempFile are dropped.
|
Thank you for the comment @alamb .
I've seen that PR but I don't think it's just an issue of the test.
This is what I implied. If we can change the API, I'll propose the second solution.
Yes, I also thought of this solution. But I thought it's too much. |
Suggested next actions:
|
@alamb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| } | ||
| Ok(()) | ||
| }) | ||
| /// A named temporary file and the directory it lives in, which may be clean up on drop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /// A named temporary file and the directory it lives in, which may be clean up on drop | |
| /// A wrapper around a [`NamedTemporaryFile`] that also contains a reference | |
| /// to the temporary directory it is in. The file is cleaned up on drop. |
| /// A named temporary file and the directory it lives in, which may be clean up on drop | ||
| #[derive(Debug)] | ||
| pub struct RefCountedTempFile { | ||
| /// directory in which temporary files are created |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /// directory in which temporary files are created | |
| /// directory in which temporary files are created (Arc is held to ensure | |
| /// it is not cleaned up prior to the NamedTempFile) |
| } | ||
| } | ||
|
|
||
| /// A wrapper around a [`NamedTemporaryFile`] that also contains a reference |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's NamedTempFile, not NamedTemporaryFile. I'll fix it.
Yea, as I commented in #7534 and @alamb's description in #7546, for DataFusion users this is not likely an issue except for the cases like the test that you construct physical query plan (like we did internally also) and execute it. So my idea was to propose a simplest fix to the test. As this doesn't look like a urgent bug to fix, I was thinking to find some time later to propose another fix to avoid |
| } | ||
| } | ||
|
|
||
| /// A wrapper around a [`NamedTempFile`] that also contains a reference |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /// A wrapper around a [`NamedTempFile`] that also contains a reference | |
| /// A wrapper around a [`NamedTempFile`] that also contains a reference to its parent temporary directory |
| /// directory in which temporary files are created (Arc is held to ensure | ||
| /// it is not cleaned up prior to the NamedTempFile) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /// directory in which temporary files are created (Arc is held to ensure | |
| /// it is not cleaned up prior to the NamedTempFile) | |
| /// The reference to the directory in which temporary files are created to ensure | |
| /// it is not cleaned up prior to the NamedTempFile |
Yeah, thank you for recovering CI quickly! |
|
Thanks again @sarutak |
Which issue does this PR close?
Closes #7537,
Closes #7523
Closes #7546
Rationale for this change
This issue seems a potential race condition issue.
To improve the stability, this issue need to be fixed.
What changes are included in this PR?
This issue happens when the parent directory of spilled files are deleted before being read.
The root cause is
TempDirof the parent directory can be dropped if this async block exits before the reading task starts like as follows.sorterand it's members are dropped includingDiskManager::local_dirs.TempFileinlocal_dirsare dropped then the corresponding temporary directories are deleted recursively.To avoid breaking API compatibility, this change proposes to not useTempDirand let the DiskManager directly creates temporary spilled file inlocal_dirs.UPDATE: After the discussion, I prefer the second solution below.
I have another solution but it breaks the API compatibility of
DiskManager::create_tmp_file, which returns a pair ofArc<TempDir>andNamedTempFileto ensureTempDirlives long enough.Spilled files are represented asNamedTempFileand temporary files are deleted whenNamedTempFile::dropis called.So, I think not using
TempDiris not a problem.Are these changes tested?
Added new test.
Are there any user-facing changes?
No.