Hi all,
I am using Riak 1.4.2 with LevelDB and I have found something strange.
We have one index called 'expiration_epoch_int', which acts like a TTL for each key in this bucket. To find expired data to delete, we just query expiration_epoch_int between 0 and 'now'. For a small amount of data this works really well.
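To make the selection rule concrete, here is a minimal sketch of the range we ask for. It is only an in-memory stand-in, not the Riak client call: the dict plays the role of the secondary index, and the function name `expired_keys` is made up for illustration.

```python
import time
from typing import Dict, List, Optional


def expired_keys(index: Dict[str, int], now: Optional[int] = None) -> List[str]:
    """Return keys whose expiration_epoch_int lies in [0, now], oldest first.

    `index` maps key -> expiration_epoch_int and stands in for the real
    secondary index (an assumption made only for this illustration).
    """
    if now is None:
        now = int(time.time())  # 'now' as epoch seconds
    # Keep entries inside the inclusive range, then order by index value.
    hits = [(epoch, key) for key, epoch in index.items() if 0 <= epoch <= now]
    return [key for _epoch, key in sorted(hits)]
```

For example, with `{"a": 100, "b": 300, "c": 200}` and `now=250`, the expired keys are "a" and "c", in that order.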
But today I found this: the first 10 results from this query are keys that no longer exist. They had already been deleted. I get 'HTTP/1.1 404 Object Not Found' if I try to inspect them.
If I use a small range, such as +/- 1 second around now, I get good results (keys that exist in Riak), but if I start from 0 (or 1), at least the beginning of the result set consists of keys that do not exist.
If I use return_terms=true, I can see the expiration_epoch_int values too (they fall between 10 and 20 days ago). I am using the PBC interface for both query and delete.
So, my question: why does this happen? Could it be related to pagination (maybe some cache)? When we run our cleanup process, we process, for example, ~7 x 10^6 keys.
To control the expiration of a huge amount of data, is it safe to use only one secondary index? Is there some limit for a huge number of keys? I have no idea where to start investigating this.
I will try to run a more complete test to find the percentage of deleted keys returned by Riak.
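The test I have in mind would look roughly like the sketch below. It is deliberately client-agnostic: `keys` would be whatever the index query returns, and `exists` is a hypothetical callable that fetches one key and reports whether the object is really there. Neither name comes from the Riak client; they are placeholders.

```python
from typing import Callable, Iterable, Tuple


def stale_fraction(
    keys: Iterable[str], exists: Callable[[str], bool]
) -> Tuple[int, int, float]:
    """Count how many index results point at objects that no longer exist.

    Returns (missing, total, missing / total). `exists` is a placeholder
    for a real lookup against the cluster.
    """
    total = 0
    missing = 0
    for key in keys:
        total += 1
        if not exists(key):
            missing += 1
    # Avoid dividing by zero when the query returns nothing.
    fraction = missing / total if total else 0.0
    return missing, total, fraction
```

Because it consumes `keys` as a stream, it should also work over paginated results without holding all ~7 x 10^6 keys in memory.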