Description
Problem Description
Currently, the Catalog records only which Backend a replica belongs to; it does not record the path (disk) on which the replica is stored.
Without the path info, we cannot precisely control MIGRATION or SUPPLEMENT when handling load balancing or tablet repair. If a Backend has multiple disks, one disk may end up almost full while another is empty.
Solution
Report path info via tablet report from Backend, and save it in Catalog.
How to
Backend
Change the tablet report data structure.
Currently, the tablet part of TReportRequest is:
map<Types.TTabletId, TTablet>.
We change it to:
map<long, map<Types.TTabletId, TTablet>>.
The outer map key is the hash value of the "root path". This modification only slightly increases the size of the tablet report.
We use the hash value rather than the path string as the key because comparing 'long' values is faster than comparing strings.
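As a rough illustration of the new report layout, the sketch below groups tablets by the hash of their root path. The names `path_hash` and `build_tablet_report` are hypothetical, and the hash function (first 8 bytes of SHA-256, read as a signed 64-bit value) is only an assumption; the proposal does not say which hash the Backend uses.

```python
import hashlib
from collections import defaultdict


def path_hash(root_path: str) -> int:
    """Hypothetical stable hash of a root path, sized to fit a signed 64-bit long."""
    digest = hashlib.sha256(root_path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def build_tablet_report(tablets):
    """Group (root_path, tablet_id, tablet_info) tuples into
    {path_hash: {tablet_id: tablet_info}}, mirroring
    map<long, map<Types.TTabletId, TTablet>>."""
    report = defaultdict(dict)
    for root_path, tablet_id, info in tablets:
        report[path_hash(root_path)][tablet_id] = info
    return dict(report)


if __name__ == "__main__":
    report = build_tablet_report([
        ("/data1", 10001, {"version": 3}),
        ("/data1", 10002, {"version": 1}),
        ("/data2", 10003, {"version": 7}),
    ])
    for p_hash, tablets in report.items():
        print(p_hash, sorted(tablets))
```

Compared with the flat map, the only extra data per report is one 8-byte key per root path, which is why the size increase is small.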
Frontend
Currently, only full tablet reports are supported, so on each tablet report we need to compare and set the path of every replica.
In testing, 500,000 compare-and-set operations on Long values took 3 ms on a MacBook Pro (2.3 GHz Intel Core i5). This cost is negligible, because handling a tablet report is not a latency-sensitive operation.
Path info will not be persisted in the Catalog, which means only the Master FE knows it. Since all scheduling work is done on the Master FE, non-master FEs do not need this info.
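The compare-and-set step on the Frontend could look roughly like the sketch below. `Replica` and `apply_tablet_report` are hypothetical names used only for illustration, and the in-memory index keyed by (tablet id, backend id) is an assumption, not the actual Catalog structure.

```python
class Replica:
    """Minimal stand-in for a Catalog replica entry."""

    def __init__(self, backend_id):
        self.backend_id = backend_id
        # Kept in memory only; not persisted, so it is empty after an FE restart.
        self.path_hash = None


def apply_tablet_report(replicas, backend_id, report):
    """Compare and set the path hash of every reported replica.

    replicas: dict mapping (tablet_id, backend_id) -> Replica
    report:   dict mapping path_hash -> {tablet_id: tablet_info}
    Returns the number of replicas whose path hash changed.
    """
    updated = 0
    for p_hash, tablets in report.items():
        for tablet_id in tablets:
            replica = replicas.get((tablet_id, backend_id))
            if replica is None:
                continue  # tablet unknown to the Catalog; handled elsewhere
            if replica.path_hash != p_hash:
                replica.path_hash = p_hash
                updated += 1
    return updated
```

Each replica costs one integer comparison and at most one assignment per report, which is consistent with the low cost measured above.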
Tablet report trigger
Currently, the Backend reports tablet info at a fixed interval, which means that after an FE restart, the FE may wait up to one report interval before it gets the path info of replicas.
So we need an active trigger that tells the Backend to report tablet info immediately once the Master FE has restarted or a non-master FE has become Master.
This could probably be done simply through the heartbeat. But the problem is: how does a Backend realize that the FE has restarted?
The solution is to add a random integer to the heartbeat. When the Backend receives a heartbeat and finds that nothing has changed except this random integer, it should conclude that the Master FE has restarted or changed. At that moment, the Backend reports its tablets immediately.
The Backend already reports its tablets when it restarts, so nothing needs to be modified in the Backend restart logic.
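A minimal sketch of the heartbeat trigger follows, assuming the random integer is generated once when an FE starts or becomes Master and is then sent unchanged in every heartbeat. The classes, field names, and the 31-bit token width are all hypothetical; for brevity the sketch compares only the token and ignores the other heartbeat fields.

```python
import random


class MasterFrontend:
    """Stand-in for a Master FE that stamps heartbeats with a per-start token."""

    def __init__(self, host):
        self.host = host
        # Drawn once per start or per transfer to Master (assumption).
        self.token = random.SystemRandom().getrandbits(31)

    def heartbeat(self):
        return {"master_host": self.host, "token": self.token}


class Backend:
    """Stand-in for a Backend that reports tablets when the token changes."""

    def __init__(self):
        self.last_token = None
        self.reports_sent = 0

    def report_tablets(self):
        # Placeholder for sending a full tablet report to the Master FE.
        self.reports_sent += 1

    def on_heartbeat(self, hb):
        token = hb["token"]
        if self.last_token is None:
            # First heartbeat after a Backend start: the startup report
            # already covers this case, so only remember the token.
            self.last_token = token
        elif token != self.last_token:
            # Master FE restarted or changed: report immediately.
            self.last_token = token
            self.report_tablets()
```

With a random token there is a small chance that a restarted FE draws the same value as before; in that case the Backend simply falls back to the regular periodic report.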