dscLabJNU/Gecko

Gecko

Introduction

Gecko is a novel sliding window aggregation algorithm that supports bulk eviction. It uses a granular-based eviction strategy that adapts to the bulk size, enabling efficient bulk eviction while keeping single-eviction performance close to that of in-order stream algorithms.

For large data bulks, Gecko first performs coarse-grained eviction at the chunk level, then fine-grained eviction using leftward binary tree aggregation (LTA) as a complementary method. Gecko also partitions data into chunks so that out-of-order data affects only the chunk it falls into, which enables efficient handling of out-of-order data streams.
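The coarse-then-fine idea can be illustrated with a deliberately simplified sketch. This is not Gecko's implementation: the class name `ChunkedWindow`, its methods, and the fixed chunk capacity are all hypothetical, the fine-grained step simply recomputes the partially evicted chunk instead of using LTA, and out-of-order insertion is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Simplified illustration of chunk-level bulk eviction (assumption: in-order
// data, fixed chunk capacity, associative operator with an identity element).
template <typename T, typename Op>
class ChunkedWindow {
public:
    ChunkedWindow(std::size_t chunk_capacity, T identity, Op op)
        : cap_(chunk_capacity), identity_(identity), op_(op) {
        assert(cap_ > 0 && "chunk capacity must be positive");
    }

    // Append the newest element, opening a new chunk when the last one is full.
    void insert(const T& v) {
        if (chunks_.empty() || chunks_.back().items.size() == cap_) {
            chunks_.push_back(Chunk{{}, identity_});
        }
        Chunk& c = chunks_.back();
        c.items.push_back(v);
        c.agg = op_(c.agg, v);
        ++size_;
    }

    // Evict the k oldest elements.
    void bulk_evict(std::size_t k) {
        assert(k <= size_ && "cannot evict more than the window holds");
        size_ -= k;
        // Coarse-grained step: drop whole chunks without touching their items.
        while (!chunks_.empty() && k >= chunks_.front().items.size()) {
            k -= chunks_.front().items.size();
            chunks_.pop_front();
        }
        // Fine-grained step: trim the front chunk and recompute its aggregate.
        // (Stand-in for LTA; costs O(chunk size).)
        if (k > 0) {
            Chunk& c = chunks_.front();
            c.items.erase(c.items.begin(),
                          c.items.begin() + static_cast<std::ptrdiff_t>(k));
            c.agg = identity_;
            for (const T& v : c.items) c.agg = op_(c.agg, v);
        }
    }

    // Combine the per-chunk aggregates from oldest to newest.
    T query() const {
        T acc = identity_;
        for (const Chunk& c : chunks_) acc = op_(acc, c.agg);
        return acc;
    }

    std::size_t size() const { return size_; }

private:
    struct Chunk {
        std::vector<T> items;  // raw values, oldest first
        T agg;                 // aggregate of items
    };

    std::size_t cap_;
    T identity_;
    Op op_;
    std::deque<Chunk> chunks_;
    std::size_t size_ = 0;
};
```

In this sketch a bulk eviction touches individual items in at most one chunk; every fully covered chunk is discarded in constant time.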

Gecko has been accepted by TKDE (Transactions on Knowledge and Data Engineering).

If you use Gecko in your research, please cite our TKDE paper:

Jianjun Li, Yuhui Deng, Jiande Huang, Yi Zhou, Qifen Yang, and Geyong Min, "Gecko: Efficient Sliding Window Aggregation with Granular-based Bulk Eviction over Big Data," in IEEE Transactions on Knowledge and Data Engineering, doi: 10.1109/TKDE.2024.3511334.

This repo contains reference implementations of sliding window aggregation algorithms.

All of these algorithms require operators that are associative. We classify the algorithms in two groups: those that require data to arrive in-order, and those that allow data to arrive out-of-order. We refer to the algorithms that require data to arrive in-order as FIFO algorithms, as they assume first-in, first-out semantics. We refer to the algorithms that tolerate disordered data as general algorithms.

The complexities listed below are stated in terms of the window size, n.

A tutorial and an encyclopedia article provide more background on sliding window aggregation algorithms.

DABA

DABA Lite

FiBA

FlatFIT

IOA

Two-Stacks

  • full name: Two-Stacks
  • ordering: in-order required
  • operator requirements: associativity
  • time complexity: worst-case O(n), amortized O(1)
  • space requirements: 2n
  • first appeared: adamax on Stack Overflow
  • implementations: C++, Rust
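The properties listed above can be seen in a minimal sketch of the Two-Stacks technique. The class and method names here are illustrative and are not taken from this repository's C++ implementation. Each stack entry stores a value plus a running aggregate, which is where the 2n space comes from, and the occasional "flip" from the back stack to the front stack accounts for the O(n) worst case.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Two-Stacks sketch for an associative operator `op` with identity `identity`.
template <typename T, typename Op>
class TwoStacks {
public:
    TwoStacks(T identity, Op op) : identity_(identity), op_(op) {}

    // Push the newest element onto the back stack: O(1).
    void insert(const T& v) {
        T agg = back_.empty() ? v : op_(back_.back().second, v);
        back_.push_back({v, agg});
    }

    // Remove the oldest element: amortized O(1), O(n) when a flip is needed.
    void evict() {
        assert(size() > 0 && "evict on empty window");
        if (front_.empty()) flip();
        front_.pop_back();
    }

    // Aggregate of the whole window in oldest-to-newest order: O(1).
    T query() const {
        T f = front_.empty() ? identity_ : front_.back().second;
        T b = back_.empty() ? identity_ : back_.back().second;
        return op_(f, b);
    }

    std::size_t size() const { return front_.size() + back_.size(); }

private:
    // Move every element from the back stack to the front stack. Elements are
    // popped newest first, so the oldest element ends up on top of front_.
    void flip() {
        while (!back_.empty()) {
            T v = back_.back().first;
            T agg = front_.empty() ? v : op_(v, front_.back().second);
            front_.push_back({v, agg});
            back_.pop_back();
        }
    }

    T identity_;
    Op op_;
    // front_: (value, aggregate of this value and all newer values in front_)
    std::vector<std::pair<T, T>> front_;
    // back_: (value, aggregate of all older values in back_ and this value)
    std::vector<std::pair<T, T>> back_;
};
```

Because the operands are always combined in window order, the sketch also works for associative operators that are not commutative or invertible, such as max.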

Two-Stacks Lite

Reactive

Recalc

  • full name: Re-Calculate From Scratch
  • ordering: out-of-order allowed
  • operator requirements: none
  • time complexity: O(n)
  • space requirements: n
  • first appeared: no known source
  • implementations: C++, Rust
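As a rough illustration of the entry above, the following sketch recomputes the aggregate from scratch on every query. The names are hypothetical and not taken from this repository. Two choices here are assumptions made for the sketch: entries are kept in a `std::multimap` keyed by timestamp so that out-of-order insertion is trivial, and an identity value is passed in only so that an empty window has a defined result.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Re-Calculate From Scratch sketch: O(n) per query, n stored entries.
template <typename Time, typename T, typename Op>
class Recalc {
public:
    Recalc(T identity, Op op) : identity_(identity), op_(op) {}

    // Insert at any timestamp; the multimap keeps entries sorted by time.
    void insert(Time t, const T& v) { items_.emplace(t, v); }

    // Remove the entry with the smallest timestamp.
    void evict() {
        assert(!items_.empty() && "evict on empty window");
        items_.erase(items_.begin());
    }

    // Remove every entry whose timestamp is <= t.
    void bulk_evict(Time t) {
        items_.erase(items_.begin(), items_.upper_bound(t));
    }

    // Fold over all entries in timestamp order.
    T query() const {
        T acc = identity_;
        for (const auto& kv : items_) acc = op_(acc, kv.second);
        return acc;
    }

    std::size_t size() const { return items_.size(); }

private:
    T identity_;
    Op op_;
    std::multimap<Time, T> items_;
};
```

Nothing is cached between queries, which is why the operator needs no special properties and why each query costs O(n).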

SOE

  • full name: Subtract on Evict
  • ordering: out-of-order allowed
  • operator requirements: associativity, invertibility
  • time complexity: worst-case O(1)
  • space requirements: n
  • first appeared: no known source
  • implementations: C++ (strictly in-order), Rust (strictly in-order)
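A minimal sketch of the Subtract-on-Evict idea follows, restricted to strictly in-order data like the implementations listed above. The class name and the `inv` parameter are illustrative assumptions: `inv(agg, v)` is expected to undo the contribution of the oldest value `v` to the running aggregate, for example subtraction when `op` is addition.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Subtract-on-Evict sketch: keeps one running aggregate and updates it in
// O(1) on every insert and evict. Requires an invertible operator.
template <typename T, typename Op, typename Inv>
class SubtractOnEvict {
public:
    SubtractOnEvict(T identity, Op op, Inv inv)
        : agg_(identity), op_(op), inv_(inv) {}

    // Fold the newest value into the running aggregate.
    void insert(const T& v) {
        items_.push_back(v);
        agg_ = op_(agg_, v);
    }

    // Remove the oldest value and undo its contribution.
    void evict() {
        assert(!items_.empty() && "evict on empty window");
        agg_ = inv_(agg_, items_.front());
        items_.pop_front();
    }

    // The running aggregate is always up to date.
    T query() const { return agg_; }

    std::size_t size() const { return items_.size(); }

private:
    T agg_;
    Op op_;
    Inv inv_;
    std::deque<T> items_;  // the n stored values, oldest first
};
```

Operators such as max or min have no inverse, so this approach does not apply to them; that gap is what the other algorithms in this list address.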

Amortized MTA (AMTA)
