@marising (Contributor) commented Jul 2, 2020

Features

  1. Find the cache node by SQL Key, then find the corresponding partition data by Partition Key, and then decide whether it is a cache hit by comparing LastVersion and LastVersionTime.
  2. The design follows the classic LRU (least recently used) cache algorithm and is implemented with a three-layer data structure.
  3. The cache eviction algorithm tries to keep each node's partition range contiguous wherever possible, because gaps in the partition range would reduce the hit rate of the partition cache.
  4. Two thresholds, maximum memory and elastic memory, control eviction so that data is not evicted too frequently.
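
To illustrate item 4, here is a minimal sketch of how two thresholds can dampen eviction. The names (`PruneThresholds`, `should_start_prune`, `bytes_to_free`) are hypothetical, and the exact rule is an assumption not stated above: pruning starts only once usage exceeds maximum plus elastic memory, and then frees down to the maximum.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names: the description only mentions "maximum memory" and "elastic memory".
struct PruneThresholds {
    std::size_t max_bytes;      // steady-state limit
    std::size_t elastic_bytes;  // slack tolerated above the limit before pruning
};

// Assumption: a prune pass is triggered only when usage exceeds max + elastic.
inline bool should_start_prune(const PruneThresholds& t, std::size_t used) {
    return used > t.max_bytes + t.elastic_bytes;
}

// Assumption: once triggered, a prune pass frees memory down to max_bytes.
// Returns 0 when no prune pass is due.
inline std::size_t bytes_to_free(const PruneThresholds& t, std::size_t used) {
    if (!should_start_prune(t, used)) return 0;
    assert(used > t.max_bytes);
    return used - t.max_bytes;
}
```

With this rule, usage that hovers just above the maximum does not trigger a prune on every insert; a pass runs only after the elastic slack is used up.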

Cache fetch

  1. A HashMap makes it fast to find cache nodes.
  2. A doubly linked list keeps the most recently accessed node at the bottom and the least recently accessed node at the top.
  3. The partition data under a node is stored in a linked list sorted by partition key. Because the number of requested partitions is not expected to be large, the two ordered sequences (requested partitions and cached partitions) are walked together in a single loop to find the matching partitions.
  4. Every access updates the access time of the partition.
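
The fetch path above can be sketched roughly as follows. This is an illustration, not the code in this PR: `LruIndex`, `CacheNode`, and `Partition` are hypothetical names, a `std::vector` stands in for the per-node partition list, and the version check is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct Partition {
    int64_t key;
    int64_t last_access;
};

struct CacheNode {
    std::string sql_key;
    std::vector<Partition> parts;  // kept sorted by partition key
};

class LruIndex {
public:
    void insert(CacheNode node) {
        assert(index_.find(node.sql_key) == index_.end());
        nodes_.push_back(std::move(node));
        index_[nodes_.back().sql_key] = std::prev(nodes_.end());
    }

    // Returns the requested partition keys that are cached. `wanted` must be sorted.
    std::vector<int64_t> fetch(const std::string& sql_key,
                               const std::vector<int64_t>& wanted, int64_t now) {
        std::vector<int64_t> hits;
        auto it = index_.find(sql_key);  // hash lookup of the node
        if (it == index_.end()) return hits;
        // Move the node to the bottom (tail); list iterators stay valid after splice.
        nodes_.splice(nodes_.end(), nodes_, it->second);
        std::vector<Partition>& parts = it->second->parts;
        std::size_t i = 0, j = 0;
        // Walk both sorted sequences together in one loop.
        while (i < parts.size() && j < wanted.size()) {
            if (parts[i].key < wanted[j]) {
                ++i;
            } else if (parts[i].key > wanted[j]) {
                ++j;
            } else {
                parts[i].last_access = now;  // refresh access time on every hit
                hits.push_back(parts[i].key);
                ++i;
                ++j;
            }
        }
        return hits;
    }

    // Node at the top of the list, i.e. the first eviction candidate.
    const std::string& least_recent() const { return nodes_.front().sql_key; }

private:
    std::list<CacheNode> nodes_;  // top = least recent, bottom = most recent
    std::unordered_map<std::string, std::list<CacheNode>::iterator> index_;
};
```

Storing list iterators in the hash map keeps both the lookup and the move to the tail O(1); the merge-style loop costs O(cached + requested) per fetch.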

Cache update

  1. Because the volume of updates is expected to be relatively small, a hash table is used here to find the corresponding partition.
  2. Check whether the incoming version is higher than the cached version. If it is higher, the partition is updated; otherwise the update is skipped.
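
A minimal sketch of the update rule, assuming a per-node hash table keyed by partition key. `PartitionEntry` and `update_partition` are hypothetical names, `rows` is a stand-in for the cached result data, and treating a missing partition as an insert is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

struct PartitionEntry {
    int64_t version;
    std::string rows;  // stand-in for the cached result rows
};

// Returns true when the cache entry was written, false when the update was skipped.
inline bool update_partition(std::unordered_map<int64_t, PartitionEntry>& parts,
                             int64_t partition_key, int64_t version, std::string rows) {
    assert(version >= 0);  // assumption: versions are non-negative
    auto it = parts.find(partition_key);
    if (it == parts.end()) {
        // Assumption: an unseen partition is simply inserted.
        parts.emplace(partition_key, PartitionEntry{version, std::move(rows)});
        return true;
    }
    if (version <= it->second.version) return false;  // same or stale version: keep existing
    it->second = PartitionEntry{version, std::move(rows)};
    return true;
}
```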

Cache pruning

  1. The number below each Part in the figure below is a timestamp: the time of that partition's most recent access.
  2. The algorithm walks the doubly linked list to find nodes that have not been accessed recently and evicts their partitions, repeating the pass over the list until memory usage is back within the threshold.
  3. As shown in the figure below, it finds Node1 at the top, finds its Part1 and evicts it, then finds Part1 of Node2 and evicts it, thereby evicting the partitions with timestamps 1-3.
  4. Node1 is then removed because it has no partitions left.
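
The pruning steps above can be approximated with the sketch below. It is a simplification under stated assumptions: each pass visits nodes from the top of the list and evicts the first partition of each node, without comparing timestamps, and `Part`, `Node`, and `prune` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <iterator>
#include <list>
#include <string>

struct Part {
    long key;
    std::size_t bytes;
};

struct Node {
    std::string sql_key;
    std::deque<Part> parts;  // ordered by partition key
};

// Evicts partitions until `used` is at or below `target`; returns the new usage.
// `lru` is ordered with the least recently accessed node at the front (top).
inline std::size_t prune(std::list<Node>& lru, std::size_t used, std::size_t target) {
    while (used > target && !lru.empty()) {
        // One pass from the top: drop the first partition of each node in turn.
        for (auto it = lru.begin(); it != lru.end() && used > target;) {
            if (!it->parts.empty()) {
                assert(used >= it->parts.front().bytes);
                used -= it->parts.front().bytes;
                it->parts.pop_front();  // evict from the edge so the remaining range stays contiguous
            }
            // A node with no partitions left is removed from the list.
            it = it->parts.empty() ? lru.erase(it) : std::next(it);
        }
    }
    return used;
}
```

Evicting only from the edge of a node's partition list is one way to avoid holes in the cached range, which matches the contiguity goal in the Features list.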

(Figure: be_cache)

@morningman added the area/sql/execution and kind/feature labels Jul 2, 2020
@morningman morningman left a comment

Missing partition_cache_test file?

wutiangan previously approved these changes Aug 5, 2020
wutiangan previously approved these changes Aug 10, 2020
wutiangan previously approved these changes Aug 10, 2020
wutiangan previously approved these changes Aug 22, 2020
@marising changed the title from "LRU cache for sql/partition cache #2581" to "[Cache][BE] LRU cache for sql/partition cache #2581" Sep 16, 2020
1. Cache data by SQL key and partition key
2. Support fetch/update/clear operations
3. Hit cache by key, version and version_time
@marising force-pushed the partition_cache_be branch 3 times, most recently from a951bc4 to 24d78f2 September 18, 2020 10:08
@morningman (Contributor) commented

Still has memory leak in UT

@morningman morningman left a comment

LGTM

@morningman added the approved label Sep 19, 2020
@morningman merged commit 5f43fb3 into apache:master Sep 20, 2020