Opening data dirs parallelly is not thread-safe #3282

@vagetablechicken

Describe the bug
Commit 162b1c5 makes the data dirs initialize in parallel, but I found that this is not thread-safe.
When the BE is starting, it may fail with this error:

[data_dir.cpp:249] fail to find file system, path=/xxx

I had not noticed that DataDir::_init_file_system() can race.
getmntent() is documented as MT-Unsafe (race:mntentbuf locale),
so when several data dirs call it concurrently, mount_entry may hold a wrong value or be NULL, and is_find will be false.
https://github.com/apache/incubator-doris/blob/614a76beeac73821c78903c46e7a703b7956796b/be/src/olap/data_dir.cpp#L228-L251
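The race:mntentbuf note suggests that getmntent() hands back a pointer into storage shared by all callers. The sketch below is a way to check that on a given libc, not code from Doris: the helper name getmntent_results_alias is hypothetical, and the assumption that results share one static object holds for glibc-style implementations but is not guaranteed by any standard.

```cpp
#include <mntent.h>  // setmntent, getmntent, endmntent
#include <cassert>
#include <cstdio>

// Hypothetical demo helper (not part of Doris): reads the first entry of two
// mount tables through getmntent() and reports whether both calls returned
// the same address, i.e. whether the second call reused the first result's
// storage.
bool getmntent_results_alias(const char* table_a, const char* table_b) {
    assert(table_a != nullptr && table_b != nullptr);

    FILE* fa = setmntent(table_a, "r");
    FILE* fb = setmntent(table_b, "r");

    bool alias = false;
    if (fa != nullptr && fb != nullptr) {
        struct mntent* first = getmntent(fa);
        struct mntent* second = getmntent(fb);
        // If the pointers match, "first" now describes table_b's entry.
        alias = (first != nullptr && first == second);
    }

    if (fa != nullptr) endmntent(fa);
    if (fb != nullptr) endmntent(fb);
    return alias;
}
```

If this returns true even in a single thread, then two data dirs scanning the mount table at the same time can overwrite each other's entry mid-comparison, which would be consistent with the lookup failing at startup.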

We have two solutions:

  1. Use getmntent_r() instead; getmntent_r() is MT-Safe (locale).
  2. Split out _init_meta() (which opens RocksDB) and run only _init_meta() in parallel. Keep the other parts of data dir init sequential.
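Solution 1 could look roughly like the following. This is a minimal sketch and not the actual Doris patch: find_mount_device, its parameters, the 4096-byte buffer, and the exact-match rule on mnt_dir are all assumptions made for illustration. The relevant point is that getmntent_r() (a glibc extension) writes into an entry and buffer supplied by the caller, so each thread works on its own storage.

```cpp
#include <mntent.h>  // setmntent, getmntent_r, endmntent (glibc)
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper (not the Doris implementation): scans the mount table
// at `mount_table` and, if an entry is mounted exactly at `path`, stores its
// device name in `*device` and returns true.
bool find_mount_device(const std::string& mount_table,
                       const std::string& path,
                       std::string* device) {
    assert(device != nullptr);

    FILE* fp = setmntent(mount_table.c_str(), "r");
    if (fp == nullptr) {
        return false;
    }

    struct mntent entry;  // per-call result object
    char buf[4096];       // per-call string storage; the size is an assumption
    bool is_find = false;

    while (getmntent_r(fp, &entry, buf, sizeof(buf)) != nullptr) {
        if (path == entry.mnt_dir) {
            // Copy before the next call: entry's strings point into buf.
            *device = entry.mnt_fsname;
            is_find = true;
            break;
        }
    }

    endmntent(fp);
    return is_find;
}
```

A caller would pass the system mount table (for example /etc/mtab or /proc/mounts) and the data dir path. Because the FILE*, entry, and buffer are all local to the call, concurrent calls from several init threads do not share state through this function.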

Labels: area/storage (issues or PRs related to storage engine), kind/fix (categorizes issue or PR as related to a bug)
