This is a Guava-compliant caching implementation, mainly focused on large-volume, in-process caching with minimal GC overhead. It's based on memory-mapped files, and inspired by Bitcask (Erlang) and LevelDB (C++), while remaining purely in Java.
Sequential writes give us write performance of around 3μs per operation for 100-byte key/value pairs. Lock-free reads give us read performance of around 2μs per operation, whether sequential or random.

Not yet production ready, so try at your own risk. The first batch of benchmark results:
```
LevelDB:  iq80 leveldb version 0.4
Date:     Wed Mar 05 16:41:13 PST 2014
Keys:     16 bytes each
Values:   100 bytes each (50 bytes after compression)
Entries:  10000000
RawSize:  1106.3 MB (estimated)
FileSize: 629.4 MB (estimated)
------------------------------------------------
fillseq    : 0.93751 micros/op; 118.0 MB/s
fillseq    : 2.59594 micros/op;  42.6 MB/s
fillsync   : 5.31380 micros/op;  20.8 MB/s (10000 ops)
fillrandom : 2.87762 micros/op;  38.4 MB/s
fillseq    : 3.56622 micros/op;  31.0 MB/s
overwrite  : 2.89620 micros/op;  38.2 MB/s
fillseq    : 2.92466 micros/op;  37.8 MB/s
readseq    : 1.33068 micros/op;  83.1 MB/s
readrandom : 1.79489 micros/op;  61.6 MB/s
readrandom : 1.78563 micros/op;  62.0 MB/s
readseq    : 1.08474 micros/op; 102.0 MB/s
readrandom : 1.80890 micros/op;  61.2 MB/s
readseq    : 1.09180 micros/op; 101.3 MB/s
```
- Guava Cache API features are supported, except for `#asMap` (entries live off-heap in memory-mapped files, so a live map view isn't practical); you still get `CacheBuilder`, TTL, stats, removal notification, etc.
- LRW or LRU eviction strategy
- RAM and disk consumption constraints
- Transient or persistent caching, based on configuration
- Data integrity protection via optional binary document checksums
To get started:

```java
final Cache<String, String> cache = CacheBuilder.builder(Serializers.STRING_SERIALIZER, Serializers.STRING_SERIALIZER)
        .build();
```

- Only the `Serializer`s are mandatory; everything else has meaningful defaults and can be configured further.
- You now have a transient `Cache`, ready to accept `put`, `get`, etc. The temporary files created will be released at JVM exit.
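Since the returned `Cache` follows Guava's `com.google.common.cache.Cache` contract (minus `#asMap`), everyday usage reads just like Guava. A minimal sketch, with illustrative keys and values:

```java
import java.util.concurrent.Callable;

// Given the transient cache built above:
cache.put("greeting", "hello bluewhale");

// Misses return null, just like Guava's Cache#getIfPresent.
final String hit = cache.getIfPresent("greeting");

// Cache#get(key, loader) computes and stores the value on a miss;
// note that it declares ExecutionException, so call it inside a
// try block or a throwing method.
final String loaded = cache.get("farewell", new Callable<String>() {
    @Override public String call() {
        return "goodbye bluewhale";
    }
});
```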
- The primary API of bluewhale caching is `org.ebaysf.bluewhale.Cache` and `org.ebaysf.bluewhale.configurable.CacheBuilder`, similar to `com.google.common.cache.*`.
- Functionally, bluewhale caching is mostly compliant with Guava's Cache, but you must provide the key/value `org.ebaysf.bluewhale.serialization.Serializer` in addition.
- There are various configurations you can tune via `org.ebaysf.bluewhale.configurable.Configuration`; more details are explained below.
- The structure of bluewhale caching is simple: a `com.google.common.collect.RangeMap<Integer, Segment>` and a `com.google.common.collect.RangeMap<Integer, BinJournal>` are at the core of it.
- A `Segment` is conceptually a `long[]`; a hashCode is split into `segmentCode` and `segmentOffset` to fetch a long value from the corresponding `Segment` (see the lookup sketch after this list).
- That long value is split into `journalCode` and `journalOffset` similarly, pointing to a `BinDocument` stored in some `BinJournal`.
- The `BinDocument` read back contains the serialized `key`, `value`, `hashCode`, etc., and the same `Serializer` is required to deserialize it to the expected value type.
- Cold cache is supported via `org.ebaysf.bluewhale.persistence.*`, using GSON to manifest `Segment` and `Journal` in a readable JSON format, and it allows a load back from the same file it writes to whenever there's a structural change.
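To make the two-level indirection concrete, here is a minimal, hypothetical sketch of the lookup path. The bit widths, masks, and names below are illustrative assumptions, not bluewhale's actual layout; they only show how a hashCode can split into `segmentCode`/`segmentOffset`, and the fetched long into `journalCode`/`journalOffset`:

```java
/** Hypothetical illustration only; bluewhale's real bit layout may differ. */
final class LookupSketch {

    // Assume 8 segments, each conceptually a long[]; sizes are illustrative.
    private final long[][] segments = new long[8][1 << 16];

    /** Resolves a hashCode to the packed journal pointer stored in a Segment. */
    long token(final int hashCode) {
        final int segmentCode   = hashCode >>> 29;            // top 3 bits pick the Segment
        final int segmentOffset = hashCode & ((1 << 16) - 1); // low bits index into its long[]
        return segments[segmentCode][segmentOffset];
    }

    /** The packed long splits the same way: which BinJournal holds the BinDocument... */
    int journalCode(final long token) {
        return (int) (token >>> 32);
    }

    /** ...and at which offset inside that journal the BinDocument lives. */
    int journalOffset(final long token) {
        return (int) token;
    }
}
```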
- key `Serializer` must be provided; check `org.ebaysf.bluewhale.serialization.Serializers` for existing types' support.
- value `Serializer` must be provided.
- concurrencyLevel `int` manages the number of `Segment`s to be initialized; the default value is `3`, which creates `1 << 3 = 8` segments.
- maxSegmentDepth `int` manages the width of a `Segment`; the default value is `2`, which means each `Segment` initialized cannot be split more than `2` times.
- maxPathDepth `int` manages the depth of a `Path`; the default value is `7`, which triggers a `Path` shortening request whenever a `Path` is deeper than `7`.
- factory `BinDocumentFactory` is the factory class that creates `BinDocument`s; the default is `BinDocumentFactories.RAW`, which creates `BinDocumentRaw` implementations with no checksums.
- journalLength `int` manages the bytes of each journal file; the default is `1 << 29 = 512MB`; it must be positive and no greater than `Integer.MAX_VALUE`.
- maxJournals `int` manages the total number of journal files the cache will maintain; the default is `8 => 4G`; whenever more journal files are present, eviction must happen to retire old journals.
- maxMemoryMappedJournals `int` manages the total number of journals held in RAM as memory-mapped files; the default is `2 => 1G`; whenever more memory-mapped journals are present, a downgrade must happen; check `org.ebaysf.bluewhale.storage.BinStorageImpl#downgrade` for details.
- leastJournalUsageRatio `float` manages the threshold for journal compaction; the default is `0.1f`; when a journal's usage ratio is at or below this value, that journal will be compacted.
- dangerousJournalsRatio `float` manages the threshold that triggers the LRU eviction strategy; the default is `0.25f`; when the cache is configured for LRU and the `BinDocument` read is in the dangerous journals, it will be refreshed; check `org.ebaysf.bluewhale.storage.UsageTrack#refresh`.
- ttl `Pair<Long, TimeUnit>` manages the expiration of journals; the default is `null`, which means they never expire. Otherwise, a journal will be evicted once it's older than `now - TimeUnit.toNanos(Long)`.
- evictionStrategy `EvictionStrategy` manages the eviction policy before the cache maxes out; the default is `EvictionStrategy.SILENCE`, which is effectively LRW; `EvictionStrategy.LRU` is an alternative.
- eventBus `EventBus` manages the event handling; the default is `new com.google.common.eventbus.EventBus()`.
- executor `ExecutorService` manages async job processing; the default is `java.util.concurrent.Executors.newCachedThreadPool()`.
- local `File` manages the directory where the cache puts its `Segment` and `BinJournal` files; the default is `com.google.common.io.Files.createTempDir()`.
- cold `File` manages the source/destination from/to which the cold cache is loaded/persisted; the default is `null`, which means no persistence will be done.
- persistent `boolean` manages persistence; when cold is enabled, it must be `true`; the default is `false`, in which case the `Segment` and `BinJournal` files will be deleted on JVM exit.
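Putting several of these together, a more fully configured cache might look like the following. This is a sketch only, assuming `CacheBuilder` exposes fluent setters named after the options above; the exact signatures should be confirmed against `org.ebaysf.bluewhale.configurable.CacheBuilder`:

```java
final Cache<String, String> cache = CacheBuilder
        .builder(Serializers.STRING_SERIALIZER, Serializers.STRING_SERIALIZER)
        .concurrencyLevel(3)                    // 1 << 3 = 8 segments (the default)
        .journalLength(1 << 29)                 // 512MB per journal file
        .maxJournals(8)                         // 8 x 512MB = 4G on disk before eviction kicks in
        .maxMemoryMappedJournals(2)             // 2 x 512MB = 1G kept memory-mapped in RAM
        .evictionStrategy(EvictionStrategy.LRU) // instead of the default LRW (SILENCE)
        .local(new File("/tmp/bluewhale"))      // where the Segment/BinJournal files live
        .build();
```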
- There are various tunings we can do given the configurations above; the most effective, and also the most costly, is to allocate more RAM via maxMemoryMappedJournals (a one-line configuration sketch follows the results below).
- Raising it from `2` to `4`, which effectively increases the RAM allocation from `1G` to `2G`, the very same benchmark produces the following:
```
LevelDB:  iq80 leveldb version 0.7
Date:     Tue Mar 11
Keys:     16 bytes each
Values:   100 bytes each (50 bytes after compression)
Entries:  10000000
RawSize:  1106.3 MB (estimated)
FileSize: 629.4 MB (estimated)
------------------------------------------------
fillseq    : 0.83689 micros/op; 132.2 MB/s
fillseq    : 1.30750 micros/op;  84.6 MB/s
fillsync   : 2.09480 micros/op;  52.8 MB/s (10000 ops)
fillrandom : 2.06856 micros/op;  53.5 MB/s
fillseq    : 2.29767 micros/op;  48.1 MB/s
overwrite  : 1.35896 micros/op;  81.4 MB/s
fillseq    : 1.42896 micros/op;  77.4 MB/s
readseq    : 0.40230 micros/op; 275.0 MB/s
readrandom : 0.94043 micros/op; 117.6 MB/s
readrandom : 0.94592 micros/op; 117.0 MB/s
readseq    : 0.39215 micros/op; 282.1 MB/s
readrandom : 0.93712 micros/op; 118.0 MB/s
readseq    : 0.40206 micros/op; 275.2 MB/s
```
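As referenced above, the tuning itself amounts to a single configuration change; a sketch, again assuming a fluent setter named after the option:

```java
// Doubles the memory-mapped working set from 2 x 512MB = 1G to 4 x 512MB = 2G.
// Setter name assumed to mirror the maxMemoryMappedJournals option.
builder.maxMemoryMappedJournals(4);
```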