Improve the Backend's disk info report performance #348

@morningman

Description

The Backend regularly reports disk info to the Frontend. Currently, we use boost recursive_directory_iterator to traverse the data dir and sum the sizes of all files as data_used_capacity.
But traversing the directories and reading the size of every file takes a lot of time. Traversing the dirs of a Backend with 500 thousand tablets (2,800,000 files) costs about 28 seconds, and it also hurts IO performance.

Modification

We use the data size saved in each tablet's header to calculate the data used capacity. These data sizes are all kept in memory. A test shows that getting the disk info of a Backend with 500 thousand tablets costs about 600 ms, which is about 46x faster than traversing the files on disk.

The test also shows that the new method reports a data used capacity of 478G, while the actual data used capacity (by du -sh) is 482G. I think we can accept this slight error (about 0.8%).
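
To illustrate the idea, here is a minimal sketch of summing the in-memory data sizes per data root path. The struct TabletHeaderInfo and the function sum_data_used_capacity are hypothetical names made up for this example, not the Backend's actual types; the sketch only assumes that each tablet's root path and header data size are already available in memory.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the per-tablet metadata kept in memory.
struct TabletHeaderInfo {
    std::string root_path;   // data root path the tablet is stored under
    std::int64_t data_size;  // bytes recorded in the tablet's header
};

// Sum the recorded data sizes per root path. No file system access is
// needed, so the cost is linear in the number of tablets.
std::unordered_map<std::string, std::int64_t>
sum_data_used_capacity(const std::vector<TabletHeaderInfo>& tablets) {
    std::unordered_map<std::string, std::int64_t> used;
    for (const auto& tablet : tablets) {
        used[tablet.root_path] += tablet.data_size;
    }
    return used;
}
```

Because the result depends only on what the headers record, it can differ slightly from what du reports, as in the 478G vs. 482G measurement above.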

But there are other directories that are outside our collection range, such as meta/, mini_download/, and trash/. These directories may also take a lot of disk space. So we use boost space_info to get the available disk space of each data root path, which takes only about half a millisecond per disk, to tell us how much space is left on the disks.
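
A minimal sketch of the per-root-path space query follows. To keep it self-contained it uses std::filesystem::space from C++17 rather than boost; the boost::filesystem::space call returns a space_info with the same capacity, free, and available fields. The names RootPathSpace and query_root_path_space are hypothetical and only serve this example.

```cpp
#include <cstdint>
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical result type for one data root path.
struct RootPathSpace {
    bool ok;                   // false if the query failed
    std::uintmax_t capacity;   // total size of the file system, in bytes
    std::uintmax_t available;  // bytes available to a non-privileged process
};

// Ask the file system for its space info instead of walking the files.
// The error_code overload is used so a bad path does not throw.
RootPathSpace query_root_path_space(const std::string& root_path) {
    std::error_code ec;
    const std::filesystem::space_info si = std::filesystem::space(root_path, ec);
    if (ec) {
        return {false, 0, 0};
    }
    return {true, si.capacity, si.available};
}
```

This is a single query per disk, so its cost does not grow with the number of files under the root path, and it also accounts for directories such as meta/ and trash/ that the tablet headers do not cover.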

Other

I decreased the interval of the disk and tablet reports from 10 min to 1 min, because they are not that expensive now.
