The BE storage_root_path is set to a soft symlink, causing the load task to stay in the QUORUM_FINISHED state #307

@leijian

Describe the bug
Two problems:

  1. The BE storage_root_path is set to a directory that is a soft symlink. New load tasks stay in the QUORUM_FINISHED state, and a large number of clone failures are reported in be.INFO:

I1110 18:46:57.581041 28672 task_worker_pool.cpp:838] get clone task. signature: 13779
I1110 18:46:57.581109 28672 command_executor.cpp:508] begin to process obtain root path. [storage_medium=0]
I1110 18:46:57.585311 28672 command_executor.cpp:535] success to process obtain root path. [path='/data1/hcp-palo-be/data/data/0']
I1110 18:46:57.585330 28672 task_worker_pool.cpp:1035] pre make snapshot. backend_ip: 10.39.11.81
I1110 18:46:57.587469 28672 task_worker_pool.cpp:1058] make snapshot success. backend_ip: 10.39.11.81, src_file_path: /data1/palo_be/data/snapshot/20181110184657.8024/, signature: 13779
I1110 18:46:57.616843 28672 task_worker_pool.cpp:881] clone copy done, src_host: 10.39.11.81, src_file_path: /data1/palo_be/data/snapshot/20181110184657.8024/
I1110 18:46:57.616852 28672 command_executor.cpp:568] begin to process load headers. [tablet_id=13779 schema_hash=184775725]
W1110 18:46:57.616866 28672 olap_engine.cpp:135] fail to find header file. [header_path=/data1/hcp-palo-be/data/data/0/13779/184775725/13779.hdr]
I1110 18:46:57.617010 28672 utils.cpp:1085] remove empty dir /data1/hcp-palo-be/data/data/0/13779
W1110 18:46:57.617056 28672 command_executor.cpp:579] fail to process load headers. [res=-102]
W1110 18:46:57.617079 28672 task_worker_pool.cpp:892] load header failed. local_shard_root_path: /data1/hcp-palo-be/data/data/0, schema_hash: 184775725, status: -102, signature: 13779
I1110 18:46:57.617096 28672 task_worker_pool.cpp:907] clone failed. want to delete local dir: /data1/hcp-palo-be/data/data/0/13779/184775725, signature: 13779
W1110 18:46:57.617107 28672 task_worker_pool.cpp:992] clone failed. signature: 13779
I1110 18:46:57.617470 28672 task_worker_pool.cpp:302] finish task success.result: 0
I1110 18:46:57.617481 28672 task_worker_pool.cpp:262] type: 3, signature: 13779 has been erased. queue size: 0
I1110 18:47:17.582481 28672 task_worker_pool.cpp:838] get clone task. signature: 13779
  2. To work around problem 1, the BE storage_root_path was changed to a non-symlink directory and BE was restarted. The historical load tasks that were in the loading state remain stuck in the loading state.
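
To see how widespread the clone failures are, the be.INFO excerpt above can be tallied by task signature. The following is a minimal sketch, not part of Palo: the function name `count_clone_failures` and the regular expressions are assumptions based only on the line format visible in the log above (level letter, MMDD date, time, thread id, `file:line]`, message).

```python
import re
from collections import Counter

# Assumed line layout, taken from the excerpt above:
#   W1110 18:46:57.617107 28672 task_worker_pool.cpp:992] clone failed. signature: 13779
LINE_RE = re.compile(
    r"^([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d+)\s+(\d+) ([\w.]+):(\d+)\] (.*)$"
)
SIG_RE = re.compile(r"signature: (\d+)")


def count_clone_failures(lines):
    """Count warning-level 'clone failed.' messages per task signature."""
    failures = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a log line in the assumed format
        level, message = match.group(1), match.group(7)
        # Only the W-level summary line is counted; the I-level
        # "clone failed. want to delete local dir" line is skipped.
        if level == "W" and message.startswith("clone failed."):
            signature = SIG_RE.search(message)
            if signature:
                failures[int(signature.group(1))] += 1
    return failures
```

A signature that keeps reappearing, as 13779 does at 18:46:57 and again at 18:47:17 in the excerpt, suggests the same clone task is being retried and failing each time.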

To Reproduce
Steps to reproduce problem 1:

  1. ln -sf /opt/palo-data /opt/palo-new-data
  2. set storage_root_path=/opt/palo-new-data
  3. start BE process
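
Before starting BE, it can help to confirm whether the configured storage_root_path is a symlink and what it resolves to. This is a minimal diagnostic sketch using only the Python standard library. It is not part of Palo, and the helper name `describe_root_path` is hypothetical.

```python
import os


def describe_root_path(path):
    """Return (is_symlink, resolved_path) for a storage root candidate."""
    # os.path.islink checks only the final path component;
    # os.path.realpath resolves symlinks anywhere in the path.
    return os.path.islink(path), os.path.realpath(path)


if __name__ == "__main__":
    # /opt/palo-new-data is the symlink created in step 1 above.
    is_link, resolved = describe_root_path("/opt/palo-new-data")
    print("symlink:", is_link, "resolves to:", resolved)
```

If the path is a symlink, pointing storage_root_path at the resolved directory avoids the setup described in problem 1. Whether that also avoids problem 2 is not established by this report.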

Steps to reproduce problem 2:

  1. set storage_root_path=/opt/palo-data
  2. restart BE process

Additional context

  • Palo Version: 0.8.2
  • Linux Version: CentOS 7.3.1611
  • Doris-BE Running on Docker Version: 1.6.2

Labels

kind/fix: Categorizes issue or PR as related to a bug.
