AI-native distributed database built from scratch in Rust. ArqonDB unifies key-value storage, vector search (DiskHNSW / SPFresh, PQ-encoded), and temporal graph traversal in a single engine — powered by Raft consensus, LSM-tree compaction, and a sharded metadata plane.
- Unified engine — KV, vector, and temporal graph in one process. No glue code between three separate systems.
- 6x faster writes than RocksDB on single-node benchmarks with WAL durability.
- Built for AI agents — causal graph, reactive state, and CAS primitives designed for agent memory and planning.
- Pure Rust, zero C++ deps — single static binary, no JNI, no CGO.
- Production topology — Raft consensus, sharded metadata, stateless gateway, Redis RESP2 compatible.
| Storage | LSM-tree with leveled compaction, MVCC, bloom filters, sharded block cache |
| Vector | DiskHNSW / SPFresh with PQ encoding, distributed fan-out search |
| Graph | Temporal edge traversal (BFS), GraphSST with temperature-based zoning |
| Consensus | Per-shard Raft groups + separate metadata Raft plane |
| Interfaces | gRPC, Redis RESP2, REST management API, React UI |
| SDKs | Python, Java, Rust, Go, C++, Node.js |
ArqonDB matches or outperforms RocksDB on all single-node benchmarks. Both use page-cache WAL durability (sync=false) — ArqonDB reuses its Raft log double-buffer WAL engine for standalone mode.
| Benchmark | ArqonDB | RocksDB | Ratio |
|---|---|---|---|
| Sequential write (10K keys) | 5.29 ms | 33.99 ms | 6.4x faster |
| Sequential read (10K keys) | 4.20 ms | 9.56 ms | 2.3x faster |
| Random read (10K keys) | 5.40 ms | 9.01 ms | 1.7x faster |
| Sequential write + flush (100K x 1KB) | 105.40 ms | 462.95 ms | 4.4x faster |
cargo bench --bench kv_benchmark

┌─────────────────────────────────────────────────────────────┐
│ Clients │
└───────────────────────┬─────────────────────────────────────┘
│ gRPC
▼
┌─────────────────────────────────────────────────────────────┐
│ Gateway (stateless) │
│ shard-map cache + leader retry + vector merge │
└───────────────────────┬─────────────────────────────────────┘
│
┌─────────────┴──────────────┐
│ │
▼ ▼
┌──────────────────┐ ┌──────────────────────────────────┐
│ Metadata Plane │ │ Data Plane │
│ (arqondb-meta) │ │ (arqondb + data-node) │
│ │ │ │
│ Raft group │ │ ShardEngine per node │
│ MetadataState │ │ LSM-tree per shard │
│ ShardMap │ │ HNSW + PQ vector index │
│ │ │ Raft per shard group │
└──────────────────┘ └──────────────────────────────────┘
| Binary | Feature Flag | Role |
|---|---|---|
| `metadata_service` | (none) | Standalone metadata Raft group |
| `raft_engine` | `data-node` | Data node: ShardEngine + gRPC KV server |
| `gateway` | (none) | Stateless routing gateway + management UI |
src/
├── engine/
│ ├── mem/ # MemTable: skip-list backed, MVCC-ordered
│ ├── sst/ # SST files: data blocks, index blocks, bloom filters
│ ├── wal/ # Write-ahead log: record framing + CRC
│ ├── version/ # VersionSet: LSM level management, compaction
│ ├── background/ # Background compaction and flush tasks
│ ├── vector/ # HNSW + PQ vector index: ANN search per shard
│ └── shard/ # ShardEngine: maps metadata events → local LSM shards
│
├── raft/
│ ├── node.rs # RaftNode (public handle) + RaftCore (event loop)
│ ├── log.rs # RaftLog: 1-indexed, sentinel at [0]
│ ├── state.rs # RaftRole, RaftState transitions
│ └── transport.rs # Lazy gRPC connections to peers
│
├── metadata/
│ ├── state.rs # MetadataState: shards, CFs, node registry
│ ├── op.rs # MetadataOp variants (CreateShard, RegisterNode, …)
│ ├── manager.rs # MetadataManager: Raft-backed metadata
│ ├── provider.rs # MetadataProvider trait (local vs remote)
│ └── router.rs # ShardRouter: (cf, key) → ShardInfo
│
├── network/
│ ├── grpc_service.rs # KV gRPC service (GrpcKvService + GrpcShardKvService)
│ ├── redis_service.rs # Redis-compatible TCP server (RESP2 protocol)
│ ├── raft_service.rs # Raft RPC handler
│ ├── metadata_service.rs # Metadata gRPC service
│ ├── metadata_client.rs # MetadataClient (remote MetadataProvider)
│ └── gateway_service.rs # Stateless routing gateway
│
└── db/
└── db_impl.rs # DBImpl: write group, WAL, memtable, compaction
- Rust 1.85+ (`rustup update stable`)
- `protoc` is not required — `protoc-bin-vendored` bundles a prebuilt binary
# Library + metadata + gateway binaries
cargo build
# Data node (requires data-node feature)
cargo build --features data-node --bin raft_engine
# All binaries
cargo build --features data-node
# Build the web UI
cd src/ui && npm install && npm run build

# All tests (~920 tests)
cargo test
# Integration tests (20 tests)
cargo test --test integration_test

ArqonDB includes a Redis-compatible TCP server (`RedisServer`) that speaks RESP2 — the same wire protocol used by Redis itself. Any existing Redis client library or `redis-cli` can connect without modification.
`RedisServer` is generic over the `KvOps` trait, so it plugs into two different positions:
Option A — inside the Gateway (recommended for production):
redis-cli ──RESP2──► RedisServer(GatewayService)
│
metadata shard lookup
│
┌─────────▼──────────┐
│ data node (leader) │
└────────────────────┘
Option B — on a single data node (simple / dev):
redis-cli ──RESP2──► RedisServer(KvService) ──► local LSM-tree
In Option A the Redis client gets exactly the same routing, leader-retry, and fault-tolerance as gRPC clients — there is no extra hop or intermediate service.
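Because any RESP2 client can connect, the wire format is worth seeing once. A minimal sketch of the client-side framing (`encode_resp2` is a hypothetical helper for illustration, not part of ArqonDB):

```python
def encode_resp2(*args: bytes) -> bytes:
    """Encode a command as a RESP2 array of bulk strings (the frame redis-cli sends)."""
    out = b"*%d\r\n" % len(args)          # array header: number of arguments
    for a in args:
        out += b"$%d\r\n%s\r\n" % (len(a), a)  # bulk string: length, then payload
    return out

frame = encode_resp2(b"SET", b"hello", b"world")
assert frame == b"*3\r\n$3\r\nSET\r\n$5\r\nhello\r\n$5\r\nworld\r\n"
```

Every command the tables below describe arrives at `RedisServer` in exactly this shape.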
| Command | Description |
|---|---|
| `SET key value [EX s\|PX ms\|EXAT ts\|PXAT ts\|KEEPTTL] [NX\|XX] [GET]` | Store a key/value pair with optional TTL and conditional semantics |
| `GET key` | Get value, or `(nil)` if absent or expired |
| `MSET key value [key value …]` | Set multiple keys |
| `MGET key [key …]` | Get multiple values (array reply) |
| `GETDEL key` | Get value then delete the key |
| `STRLEN key` | Length of stored value (0 if absent) |
| `APPEND key value` | Merge value into key (append-style merge) |
| `EXISTS key [key …]` | Count how many of the given keys exist (expired keys not counted) |
| `DEL key [key …]` | Delete keys; returns count deleted |
| `TYPE key` | Returns "string" or "none" |
| Command | Description |
|---|---|
| `EXPIRE key seconds` | Set expiry in seconds; returns 1 if set, 0 if key not found |
| `PEXPIRE key milliseconds` | Set expiry in milliseconds |
| `EXPIREAT key unix-time-seconds` | Set absolute expiry (Unix timestamp in seconds) |
| `PEXPIREAT key unix-time-ms` | Set absolute expiry (Unix timestamp in milliseconds) |
| `TTL key` | Remaining seconds; -1 = no expiry, -2 = key not found |
| `PTTL key` | Remaining milliseconds; -1 = no expiry, -2 = key not found |
| `PERSIST key` | Remove expiry; returns 1 if removed, 0 if no expiry / no key |
Expiry is enforced lazily on reads: expired keys are transparently deleted when accessed and return (nil) / 0 / "none" as appropriate.
TTL metadata is stored in an internal column family (CF 1) so it survives restarts and is replicated through Raft like any other write.
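The lazy-expiry rule above can be modeled in a few lines of Python. A toy sketch (illustrative only; ArqonDB stores TTL metadata in CF 1, not a dict):

```python
import time

class LazyTtlStore:
    """Toy model of lazy expiry: expired keys are deleted on access, not by a sweeper."""
    def __init__(self):
        self._data = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ttl_seconds=None):
        expires_at = time.monotonic() + ttl_seconds if ttl_seconds is not None else None
        self._data[key] = (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # transparent delete on read
            return None
        return value

s = LazyTtlStore()
s.set("session", "tok", ttl_seconds=0.01)
assert s.get("session") == "tok"
time.sleep(0.02)
assert s.get("session") is None  # expired and lazily deleted
```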
| Command | Description |
|---|---|
| `PING [message]` | Returns PONG (or echoes message) |
| `ECHO message` | Echo the message back |
| `QUIT` | Close the connection |
| `SELECT db` | No-op (only `SELECT 0` accepted) |
| Command | Description |
|---|---|
| `DBSIZE` | Returns 0 (full scan not yet implemented) |
| `INFO [section]` | Returns basic server info |
| `COMMAND COUNT` | Returns number of supported commands |
| `COMMAND DOCS` / `COMMAND INFO` | Empty array (compatibility shim) |
| `FLUSHDB` / `FLUSHALL` | Returns `-ERR` (destructive; not supported) |
| Category | Commands |
|---|---|
| Atomic ops | INCR, DECR, SETNX, GETSET, … |
| Lists | LPUSH, RPUSH, LRANGE, … |
| Hashes | HSET, HGET, HMGET, … |
| Sets | SADD, SMEMBERS, … |
| Sorted sets | ZADD, ZRANGE, … |
| Pub/Sub | SUBSCRIBE, PUBLISH, … |
| Transactions | MULTI, EXEC, … |
| Scripting | EVAL, EVALSHA, … |
| Key iteration | KEYS, SCAN |
All key commands operate on `USER_COLUMN_FAMILY_ID` (CF 0).
# Terminal 1: metadata server
cargo run --bin metadata_service
# Terminal 2: data node
META_SERVER=http://127.0.0.1:8379 DATA_NODE_ID=1 \
DATA_ADDR=http://127.0.0.1:7379 RAFT_ADDR=127.0.0.1:7380 \
cargo run --features data-node --bin raft_engine -- /tmp/node1 0.0.0.0:7379
# Terminal 3: gateway — enable Redis on port 6379
GATEWAY_META=http://127.0.0.1:8379 \
GATEWAY_REDIS_ADDR=0.0.0.0:6379 \
cargo run --bin gateway
# Any terminal: works with redis-cli out of the box
redis-cli -p 6379 SET hello world
redis-cli -p 6379 GET hello # → "world"
redis-cli -p 6379 DEL hello

| Variable | Default | Description |
|---|---|---|
| `GATEWAY_REDIS_ADDR` | `0.0.0.0:6379` | TCP address for the Redis-compatible listener |
// Single-node (direct DB access)
use arqondb::{DBImpl, network::{KvService, redis_service::RedisServer}};
let svc = KvService::new(DBImpl::open("/tmp/mydb").unwrap());
RedisServer::new(svc).serve("0.0.0.0:6379").await.unwrap();
// Inside gateway (shard-routed)
use arqondb::network::{gateway_service::GatewayService, redis_service::RedisServer, MetadataClient};
let (meta, _sub) = MetadataClient::connect("http://127.0.0.1:8379".to_string()).await?;
RedisServer::new(GatewayService::new(meta)).serve("0.0.0.0:6379").await.unwrap();

The ArqonDb gRPC service (defined in `proto/arqondb.proto`) exposes the following key-value operations:
| RPC | Description |
|---|---|
| `Put(PutRequest)` | Write a single key-value pair |
| `Get(GetRequest)` | Read a key (returns `found=false` when absent) |
| `Delete(DeleteRequest)` | Delete a single key |
| `Merge(MergeRequest)` | Merge an operand into an existing value |
| `BatchWrite(BatchWriteRequest)` | Atomically apply a batch of Put/Delete/Merge |
| `Scan(ScanRequest)` | Scan keys with optional prefix filter (paginated) |
| `DeleteByPrefix(DeleteByPrefixRequest)` | Delete all keys matching a prefix |
rpc DeleteByPrefix(DeleteByPrefixRequest) returns (DeleteByPrefixResponse);

| Field | Type | Description |
|---|---|---|
| `cf` | `uint32` | Column family id (0 = default user CF) |
| `prefix` | `bytes` | Key prefix to match (must be non-empty) |
Returns `deleted` (`uint32`) — the number of keys that were deleted. In distributed mode the gateway fans out to all shards in the column family and aggregates the count.
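The fan-out and count aggregation can be sketched with shards modeled as plain dicts (a hypothetical helper; the real gateway issues per-shard gRPC calls):

```python
def delete_by_prefix(shards, prefix: str) -> int:
    """Fan out a prefix delete to every shard and aggregate the deleted count."""
    deleted = 0
    for shard in shards:
        victims = [k for k in shard if k.startswith(prefix)]
        for k in victims:
            del shard[k]
        deleted += len(victims)
    return deleted

shards = [
    {"user:1": b"a", "user:2": b"b", "order:1": b"c"},  # shard 0
    {"user:3": b"d", "cart:9": b"e"},                   # shard 1
]
assert delete_by_prefix(shards, "user:") == 3
assert all(not k.startswith("user:") for shard in shards for k in shard)
```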
ArqonDB includes a built-in HNSW (Hierarchical Navigable Small World) vector index for approximate nearest neighbor (ANN) search, with optional Product Quantization (PQ) for memory-efficient large-scale search. Each node manages named vector indices in memory, accessible via gRPC. The gateway provides distributed vector search: fan-out queries to all nodes hosting an index, then merge results by distance to return the global top-k.
Client
│ VectorSearch(query, k=10)
▼
┌──────────────────┐
│ Gateway │
│ index metadata │──── VectorIndexMeta
│ lookup │ { name, node_ids }
└────────┬─────────┘
fan-out to index nodes only
┌─────────────┼─────────────┐
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Node 1 │ │ Node 2 │ │ Node 3 │
│ top-10 │ │ top-10 │ │ top-10 │
└────┬─────┘ └────┬─────┘ └────┬─────┘
└─────────────┼─────────────┘
▼
merge by distance
return top-10
Per node:
┌─────────────────────────────────┐
│ VectorIndexManager │
│ manages named indices (HNSW │
│ or PQ-HNSW) │
└──────────┬──────────────────────┘
│
┌─────────────┴─────────────┐
│ │
┌─────────────┐ ┌────────────────────┐
│ HnswIndex │ │ PqHnswIndex │
│ Full f32 │ │ HNSW + PQ codes │
│ < 100K vec │ │ > 100K vectors │
└─────────────┘ └────────────────────┘
Plain HNSW stores full f32 vectors in every graph node. Simple, highest accuracy, but memory-intensive for large datasets.
PQ-HNSW adds Product Quantization on top of HNSW:
- Construction: uses exact distances for graph quality (no accuracy loss during build)
- Search: uses Asymmetric Distance Computation (ADC) via PQ codes for fast beam traversal, then reranks top candidates with exact distances
- Memory: 128D vectors drop from 512 bytes → 32 bytes per vector (16x savings)
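The ADC step can be illustrated with toy codebooks: precompute one distance table per sub-space, then score a PQ code with table lookups only (a sketch of the technique, not ArqonDB's code):

```python
# Toy setup: dim=4, 2 sub-quantizers (2D sub-vectors), 2 centroids each.
codebooks = [
    [(0.0, 0.0), (1.0, 1.0)],  # sub-quantizer 0 centroids
    [(0.0, 1.0), (1.0, 0.0)],  # sub-quantizer 1 centroids
]

def adc_tables(query, codebooks):
    """Per sub-space: squared L2 distance from the query sub-vector to each centroid."""
    tables = []
    for s, centroids in enumerate(codebooks):
        q_sub = query[2 * s: 2 * s + 2]
        tables.append([sum((a - b) ** 2 for a, b in zip(q_sub, c)) for c in centroids])
    return tables

def adc_distance(code, tables):
    """Approximate distance = sum of table lookups, one per 1-byte PQ code."""
    return sum(tables[s][c] for s, c in enumerate(code))

tables = adc_tables([0.0, 0.0, 1.0, 0.0], codebooks)
# A vector encoded as centroids (0,0) and (1,0) reconstructs the query exactly:
assert adc_distance((0, 1), tables) == 0.0
```

The tables are built once per query, so each candidate costs only `num_sub` lookups instead of a full f32 distance; the exact rerank then repairs any approximation error in the top candidates.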
| Metric | Proto value | Description |
|---|---|---|
| L2 | `VECTOR_L2` | Squared Euclidean distance |
| Cosine | `VECTOR_COSINE` | 1 − cosine similarity |
| Inner Product | `VECTOR_INNER_PRODUCT` | Negative dot product (smaller = more similar) |
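For reference, the three metrics compute as follows (plain-Python sketches of the formulas in the table above):

```python
import math

def l2(a, b):
    """Squared Euclidean distance (VECTOR_L2)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine(a, b):
    """1 - cosine similarity (VECTOR_COSINE)."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))

def inner_product(a, b):
    """Negative dot product (VECTOR_INNER_PRODUCT): smaller = more similar."""
    return -sum(x * y for x, y in zip(a, b))

assert l2([1.0, 0.0], [0.0, 1.0]) == 2.0
assert abs(cosine([1.0, 0.0], [1.0, 0.0])) < 1e-9   # identical direction -> 0
assert inner_product([1.0, 2.0], [3.0, 4.0]) == -11.0
```

All three return "smaller = closer", which is why search results sort ascending by distance.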
All vector RPCs are part of the `ArqonDb` gRPC service defined in `proto/arqondb.proto`.
rpc CreateVectorIndex(CreateVectorIndexRequest) returns (CreateVectorIndexResponse);

| Field | Type | Description |
|---|---|---|
| `index_name` | `string` | Unique name for this index |
| `config.dim` | `uint32` | Vector dimensionality (required, > 0) |
| `config.metric` | `VectorDistanceMetric` | Distance metric (default: L2) |
| `config.m` | `uint32` | Max connections per graph layer (default: 16) |
| `config.ef_construction` | `uint32` | Build-time search width (default: 200, higher = better recall) |
| `config.ef_search` | `uint32` | Default query-time search width (default: 64) |
rpc VectorPut(VectorPutRequest) returns (VectorPutResponse);

Re-inserting the same `vector_id` replaces the previous vector.
rpc VectorDelete(VectorDeleteRequest) returns (VectorDeleteResponse);

rpc VectorSearch(VectorSearchRequest) returns (VectorSearchResponse);

| Field | Type | Description |
|---|---|---|
| `query` | `repeated float` | Query vector (must match index dimension) |
| `k` | `uint32` | Number of nearest neighbors to return |
| `ef_search` | `uint32` | Override search width for this query (0 = use index default) |
Returns a list of `VectorSearchResult { id, distance }` sorted by distance ascending.
rpc VectorGet(VectorGetRequest) returns (VectorGetResponse);

rpc DropVectorIndex(DropVectorIndexRequest) returns (DropVectorIndexResponse);

When running multiple data nodes behind the gateway, vector indexes are automatically distributed:
| RPC | Routing strategy |
|---|---|
| `CreateVectorIndex` | Broadcast to all nodes; register index → node_ids in metadata |
| `DropVectorIndex` | Send to nodes hosting the index; remove from metadata |
| `VectorPut` / `VectorDelete` | Hash `vector_id` to pick one node within the index's node set |
| `VectorGet` | Hash `vector_id` to the owning node |
| `VectorSearch` | Fan-out to all nodes hosting the index, merge by distance, return global top-k |
The gateway tracks which nodes host each index via VectorIndexMeta in the metadata Raft group. Only nodes that actually host an index participate in fan-out queries — no unnecessary broadcast to the entire cluster.
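The routing rules above reduce to two small functions. A sketch, assuming modulo hashing for illustration (the actual hash function is not specified here):

```python
import heapq

def route_put(vector_id: int, node_ids: list) -> int:
    """Pick the owning node for a vector_id (illustrative modulo hash)."""
    return node_ids[vector_id % len(node_ids)]

def merge_topk(per_node_results, k: int):
    """Merge per-node top-k lists (each sorted by distance) into a global top-k."""
    merged = heapq.merge(*per_node_results, key=lambda r: r[1])
    return list(merged)[:k]

node1 = [("a", 0.1), ("b", 0.4)]
node2 = [("c", 0.2), ("d", 0.3)]
assert merge_topk([node1, node2], 3) == [("a", 0.1), ("c", 0.2), ("d", 0.3)]
```

Because each node already returns its local top-k sorted, the gateway's merge is a k-way merge over at most `num_nodes × k` candidates, not a full re-sort of the index.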
import grpc
from arqondb_pb2 import *
from arqondb_pb2_grpc import ArqonDbStub
channel = grpc.insecure_channel("127.0.0.1:7379")
stub = ArqonDbStub(channel)
# Create a 128-dim L2 index
stub.CreateVectorIndex(CreateVectorIndexRequest(
    index_name="embeddings",
    config=VectorIndexConfig(dim=128, metric=VECTOR_L2),
))

# Insert vectors
for i in range(1000):
    stub.VectorPut(VectorPutRequest(
        index_name="embeddings",
        vector_id=i,
        vector=[float(x) for x in range(128)],  # your embedding here
    ))

# Search
resp = stub.VectorSearch(VectorSearchRequest(
    index_name="embeddings",
    query=[0.0] * 128,
    k=10,
))
for r in resp.results:
    print(f"id={r.id} distance={r.distance:.4f}")

use arqondb::engine::vector::{HnswConfig, HnswIndex, DistanceMetric};
let config = HnswConfig::new(128, DistanceMetric::Cosine);
let index = HnswIndex::new(config);
index.insert(1, vec![0.1; 128]);
index.insert(2, vec![0.2; 128]);
let results = index.search(&vec![0.15; 128], 5, None);
for r in &results {
println!("id={} distance={:.4}", r.id, r.distance);
}

use arqondb::engine::vector::{
PqHnswConfig, PqHnswIndex, PqConfig, HnswConfig, DistanceMetric,
};
// Configure HNSW graph + PQ compression
let config = PqHnswConfig {
hnsw: HnswConfig::new(128, DistanceMetric::L2),
pq: PqConfig {
dim: 128,
num_sub: 32, // 32 sub-quantizers of 4D each
num_centroids: 256, // 256 centroids per sub-quantizer → 1 byte per sub
metric: DistanceMetric::L2,
max_iter: 20,
},
rerank_k: 100, // rerank top-100 ADC candidates with exact distances
};
let index = PqHnswIndex::new(config);
// Insert vectors (graph built with exact distances)
for i in 0..10000u64 {
index.insert(i, embedding(i));
}
// Train PQ codebooks on the inserted vectors
index.train();
// Search: ADC beam search → exact rerank → top-k
let results = index.search(&query, 10, None);
// Persistence
let bytes = index.to_bytes();
let restored = PqHnswIndex::from_bytes(&bytes).unwrap();

use arqondb::engine::vector::{HnswIndex, PqHnswIndex, PqConfig, DistanceMetric};
// Start with a plain HNSW index
let hnsw = HnswIndex::new(/* ... */);
// ... insert vectors ...
// Wrap it with PQ (trains codebooks + encodes all vectors)
let pq_config = PqConfig {
dim: 128, num_sub: 32, num_centroids: 256,
metric: DistanceMetric::L2, max_iter: 20,
};
let pq_index = PqHnswIndex::from_hnsw(hnsw, pq_config, 100);

| Parameter | Default | Effect |
|---|---|---|
| `m` | 16 | Higher = better recall, more memory, slower insert |
| `ef_construction` | 200 | Higher = better graph quality, slower build |
| `ef_search` | 64 | Higher = better recall, slower query. Must be ≥ k |
| `dim` | 128 | Must match your embedding model output dimension |
| Parameter | Default | Effect |
|---|---|---|
| `num_sub` | 32 | Number of sub-quantizers. Higher = less compression, more accuracy. Must divide dim evenly |
| `num_centroids` | 256 | Centroids per sub-quantizer (max 256). Higher = better approximation, slower training |
| `max_iter` | 20 | k-means training iterations. More = better codebooks, slower training |
| `rerank_k` | 100 | Candidates reranked with exact distances. Higher = better recall, slower search. Use 2-4x your typical k |
Typical settings by use case:
| Use case | dim | metric | m | ef_construction | ef_search | PQ num_sub | rerank_k |
|---|---|---|---|---|---|---|---|
| OpenAI text-embedding-3-small | 1536 | Cosine | 16 | 200 | 100 | 192 (8D each) | 200 |
| Sentence-BERT | 384 | Cosine | 16 | 200 | 64 | 48 (8D each) | 100 |
| Image embeddings (CLIP) | 512 | InnerProduct | 24 | 300 | 128 | 64 (8D each) | 200 |
| Low-latency (< 1ms) | any | L2 | 8 | 100 | 32 | dim/4 | 50 |
| Memory-constrained (1M+ vectors) | 128 | L2 | 16 | 200 | 64 | 32 (4D each) | 100 |
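The PQ column in the table follows from simple arithmetic: with 256 centroids, each sub-quantizer emits one byte, so PQ storage is `num_sub` bytes per vector versus 4 × `dim` bytes for raw f32. A quick check:

```python
def pq_bytes_per_vector(num_sub: int) -> int:
    """With 256 centroids per sub-quantizer, each PQ code is 1 byte per sub-space."""
    return num_sub

def raw_bytes_per_vector(dim: int) -> int:
    """Full-precision storage: one f32 (4 bytes) per dimension."""
    return dim * 4

# 128D with num_sub=32: 512 bytes -> 32 bytes (16x)
assert raw_bytes_per_vector(128) == 512
assert raw_bytes_per_vector(128) // pq_bytes_per_vector(32) == 16
# 1536D (text-embedding-3-small row) with num_sub=192: 6144 -> 192 bytes (32x)
assert raw_bytes_per_vector(1536) // pq_bytes_per_vector(192) == 32
```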
When to use PQ-HNSW vs plain HNSW:
| Scenario | Recommendation |
|---|---|
| < 100K vectors | Plain HNSW — simpler, no accuracy trade-off |
| 100K–10M vectors | PQ-HNSW — 16x memory savings with minimal recall loss |
| Recall > 99% required | Plain HNSW, or PQ-HNSW with high rerank_k |
| Latency-sensitive search | PQ-HNSW — ADC table lookups faster than f32 distance |
ArqonDB provides official gRPC client SDKs for five languages. Each SDK wraps the ArqonDb gRPC service and provides idiomatic APIs for KV operations, batch writes, and vector index management.
| Language | Path | Transport | Min Version |
|---|---|---|---|
| Python | `sdk/python/` | grpcio | Python 3.9+ |
| Java | `sdk/java/` | grpc-java + Netty | Java 17+ |
| Rust | `sdk/rust/` | tonic | Rust 1.70+ |
| Go | `sdk/go/` | grpc-go | Go 1.21+ |
| C++ | `sdk/cpp/` | gRPC C++ | C++17 |
cd sdk/python && pip install grpcio grpcio-tools protobuf && make proto

from arqondb import ArqonDBClient
with ArqonDBClient("127.0.0.1:7379") as client:
    client.put(b"hello", b"world")
    print(client.get(b"hello"))  # b"world"
    client.delete(b"hello")

cd sdk/java && cp ../../proto/arqondb.proto src/main/proto/ && mvn clean compile

try (ArqonDBClient client = new ArqonDBClient("127.0.0.1", 7379)) {
    client.put("hello".getBytes(), "world".getBytes());
    byte[] val = client.get("hello".getBytes()).orElse(null);
    client.delete("hello".getBytes());
}

# Cargo.toml
[dependencies]
arqondb-client = { path = "sdk/rust" }
tokio = { version = "1", features = ["rt-multi-thread", "macros"] }

let mut client = ArqonDBClient::connect("http://127.0.0.1:7379").await?;
client.put(b"hello", b"world", None).await?;
let val = client.get(b"hello", None).await?;
client.delete(b"hello", None).await?;

cd sdk/go && make proto

client, _ := arqondb.NewClient("127.0.0.1:7379")
defer client.Close()
client.Put(ctx, []byte("hello"), []byte("world"))
val, _ := client.Get(ctx, []byte("hello"))
client.Delete(ctx, []byte("hello"))

cd sdk/cpp && mkdir build && cd build && cmake .. && make

arqondb::Client client("127.0.0.1:7379");
client.put("hello", "world");
auto val = client.get("hello"); // std::optional<std::string>
client.del("hello");

All SDKs support the full API: ping, put, get, delete, merge, batch_write, scan, delete_by_prefix, create_vector_index, drop_vector_index, vector_put, vector_delete, vector_search, vector_get. See each SDK's README for detailed documentation.
Start all three services, then use the web console to create column families and run KV operations.
# Terminal 1: metadata server
RUST_LOG=info cargo run --bin metadata_service
# Terminal 2: data node
RUST_LOG=info \
META_SERVER=http://127.0.0.1:8379 \
DATA_NODE_ID=1 \
DATA_ADDR=http://127.0.0.1:7379 \
RAFT_ADDR=127.0.0.1:7380 \
cargo run --features data-node --bin raft_engine -- /tmp/node1 0.0.0.0:7379
# Terminal 3: gateway (UI at http://localhost:9380)
RUST_LOG=info \
GATEWAY_META=http://127.0.0.1:8379 \
GATEWAY_JWT_SECRET=mysecret \
GATEWAY_USERS="admin:admin:admin" \
cargo run --bin gateway

Open http://localhost:9380 — login with admin / admin.
In the KV Console, run:
CREATECF metrics
CREATECF logs
CREATECF embeddings
Each command allocates a new CF in the metadata Raft group, and the data node automatically opens a dedicated LSM shard for it.
PUT hello world
GET hello → "world"
PUT user:alice {"age":30,"city":"Berlin"}
GET user:alice → "{\"age\":30,\"city\":\"Berlin\"}"
GET user:nobody → (nil)
PUT counter 10
MERGE counter 5
MERGE counter 3
DELETE hello
GET hello → (nil)
Via the Redis-compatible interface (once gateway is running on port 6379):
# Basic TTL
redis-cli SET session:abc token123 EX 3600 # expires in 1 hour
redis-cli TTL session:abc # → 3599 (remaining seconds)
redis-cli PTTL session:abc # → remaining milliseconds
# Conditional set (NX = only if not exists)
redis-cli SET lock:foo 1 NX EX 30
# Inspect / remove expiry
redis-cli PERSIST session:abc # → 1 (expiry removed)
redis-cli TTL session:abc              # → -1 (no expiry)

The gateway ships a built-in web management console at `GATEWAY_MGMT_ADDR` (default `0.0.0.0:9380`).
| Page | Description |
|---|---|
| Dashboard | Live stat cards (status, nodes, shards, column families), per-node shard distribution bars |
| KV Console | Terminal-style REPL — GET, PUT, DELETE, MERGE, CREATECF, DROPCF with command history (↑/↓) and CF selector |
| Users | Create, list, and delete gateway users; role assignment (admin / user) |
| Cluster | SVG network topology graph, node table, shard table, one-click rebalancing |
| Metrics | Parsed Prometheus key metrics (RPC requests/errors, cache hits, WAL bytes) + raw scrape output |
cd src/ui
npm install
npm run dev # Dev server on :5173, proxies /api → :9380
npm run build    # Outputs to src/ui/dist/

After `npm run build`, the gateway serves the compiled bundle automatically.
| Variable | Default | Description |
|---|---|---|
| `GATEWAY_ADDR` | `0.0.0.0:9379` | gRPC listen address |
| `GATEWAY_META` | `http://127.0.0.1:8379` | Metadata service URL |
| `GATEWAY_MGMT_ADDR` | `0.0.0.0:9380` | Management HTTP listen address |
| `GATEWAY_UI_DIR` | `src/ui/dist` | Directory containing the built React app |
| `GATEWAY_JWT_SECRET` | (unset — auth disabled) | HMAC secret for JWT signing |
| `GATEWAY_USERS` | `admin:admin:admin` | Comma-separated `user:pass:role` seed list |
ArqonDB can be installed as a set of macOS background services that auto-start on login. This uses launchd (the native macOS service manager) to run the metadata server, data node, and gateway as persistent services.
- Build the release binaries (or use `--no-build` if you already have them):

cargo build --release --features data-node

- Install `redis-cli` for testing. Homebrew does not offer a standalone `redis-cli` — install the full Redis package:

brew install redis

You do not need to start the Redis server. Only `redis-cli` (included in the package) is used as a client to connect to ArqonDB.
# Full install (builds + installs + starts services)
sudo bash scripts/launchd-install.sh
# If you already built release binaries, skip the build step
sudo bash scripts/launchd-install.sh --no-build

This will:
- Copy release binaries to `/usr/local/bin/arqondb-{metadata,datanode,gateway}`
- Create data directories at `/usr/local/var/arqondb/`
- Create log directories at `/usr/local/var/log/arqondb/`
- Install launchd plist files to `~/Library/LaunchAgents/`
- Start all three services in order (metadata → datanode → gateway)
Once the services are running, connect with redis-cli:
redis-cli -h 127.0.0.1 -p 6379
# Try some commands
127.0.0.1:6379> SET hello world
OK
127.0.0.1:6379> GET hello
"world"
127.0.0.1:6379> DEL hello
(integer) 1

Troubleshooting: If you see `Could not connect to Redis at 127.0.0.1:6379: Connection refused`, the gateway may not be running yet. Check status with `launchctl list | grep arqondb` and inspect logs with `tail -f /usr/local/var/log/arqondb/*.log`.
# Check status
launchctl list | grep arqondb
# View logs
tail -f /usr/local/var/log/arqondb/*.log
# Stop a service
launchctl bootout gui/$(id -u)/com.arqondb.gateway
# Start a service
launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/com.arqondb.gateway.plist
# Deploy new code (rebuild + rolling restart)
sudo bash scripts/deploy-local.sh
# Uninstall everything
sudo bash scripts/launchd-uninstall.sh
# Uninstall and remove data
sudo bash scripts/launchd-uninstall.sh --clean

# Start metadata (single node)
cargo run --bin metadata_service
# Start three data nodes
for i in 1 2 3; do
META_SERVER=http://127.0.0.1:8379 \
DATA_NODE_ID=$i \
DATA_ADDR=http://127.0.0.1:$((7379 + (i-1)*10)) \
RAFT_ADDR=127.0.0.1:$((7380 + (i-1)*10)) \
cargo run --features data-node --bin raft_engine -- /tmp/node$i 0.0.0.0:$((7379 + (i-1)*10)) &
done
# Start gateway
GATEWAY_META=http://127.0.0.1:8379 cargo run --bin gateway

- MemTable: concurrent-safe skip list with MVCC key ordering (`(user_key ASC, seq DESC, type DESC)`)
- WAL: record-framed with hardware-accelerated CRC32C, supports fragmentation for large writes; background sync thread for durability
- SST: data blocks with prefix compression and configurable restart interval; bloom filters per block range; sharded LRU block cache
- Compaction: leveled compaction, version-set tracks live files and sequence numbers
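The MVCC ordering above (`user_key ASC, seq DESC, type DESC`) can be expressed as a comparator; a sketch over (key, seq, type) tuples, not the actual key encoding:

```python
from functools import cmp_to_key

def compare(a, b):
    """user_key ascending, then seq descending, then type descending:
    the newest visible version of a key sorts first within that key."""
    (ka, seq_a, ty_a), (kb, seq_b, ty_b) = a, b
    if ka != kb:
        return -1 if ka < kb else 1
    if seq_a != seq_b:
        return -1 if seq_a > seq_b else 1  # higher seq first
    if ty_a != ty_b:
        return -1 if ty_a > ty_b else 1    # higher type first
    return 0

keys = [(b"k1", 5, 1), (b"k1", 9, 1), (b"k0", 2, 1), (b"k1", 9, 0)]
keys.sort(key=cmp_to_key(compare))
assert keys == [(b"k0", 2, 1), (b"k1", 9, 1), (b"k1", 9, 0), (b"k1", 5, 1)]
```

This ordering lets a snapshot read seek to `(user_key, snapshot_seq)` and take the first entry it lands on, since newer versions sort ahead of older ones.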
- `RaftCore` runs in a single `tokio::spawn` task — no lock contention on consensus state
- All messages (proposals, peer RPCs, timer ticks) pass through an `mpsc::UnboundedSender<RaftMsg>`
- In-flight proposals tracked by log index in `pending: HashMap<u64, oneshot::Sender<ProposeResult>>`
- Heartbeat: 50ms; election timeout: 150–300ms (randomized)
- Separate Raft group manages shard map, column family registry, and node membership
- Data nodes subscribe to the `SubscribeShardEvents` stream; gateway caches the shard map locally
- `MetadataProvider` trait abstracts local (embedded) vs remote (gRPC) metadata
- Client → Gateway (shard lookup via metadata)
- Gateway → shard leader's `data_addr` (gRPC)
- `GrpcShardKvService` → `ShardEngine::put/get/delete/merge` → routes to the correct shard's `KvService`
- `KvService::write` → `RaftNode::propose(WriteBatch::encode_to_bytes())`
- Raft commits → state machine applies → WAL flush → MemTable insert
- Compaction: leveled compaction with k-way merge, tombstone GC, merge operator support, TTL-aware expiry, and write-amplification optimization (post-delete skip at base level)
- Vector index: HNSW index per node for ANN search — L2, Cosine, Inner Product metrics; PQ compression for memory-efficient large-scale search; distributed search via gateway with index-aware fan-out and top-k merge
- Snapshot & restore: Raft InstallSnapshot for catching up lagging replicas
- Merge operator: wire through ShardEngine so MERGE reads finalize correctly
- Benchmarks: write/read throughput vs RocksDB baseline — ArqonDB outperforms RocksDB on all single-node benchmarks including flush-to-SST workloads
- Client SDKs: gRPC client libraries for Python, Java, Rust, Go, and C++
- CLI client: `arqondb-cli` for interactive get/put/scan
- Control Plane UI: React 18 management console with KV console, cluster topology, metrics
- ShardEngine gRPC server: data nodes expose ArqonDb gRPC endpoint for gateway routing
- Redis protocol: RESP2-compatible TCP server — connect any Redis client directly to ArqonDB
- TTL / expiry: `EXPIRE`, `PEXPIRE`, `EXPIREAT`, `PEXPIREAT`, `TTL`, `PTTL`, `PERSIST`; `SET` `EX/PX/EXAT/PXAT/NX/XX/GET` options
Good first issues:
- Implement `DBImpl::iterator` (currently `todo!()`)
- Add `DBImpl::flush` and `compact_range` implementations
- Write benchmarks comparing MemTable throughput to a BTreeMap baseline
Deeper contributions:
- Add IVF (Inverted File Index) pre-filtering to PQ-HNSW for billion-scale search
How to contribute:
- Fork the repo and create a branch
- Make your changes with tests
- Run `cargo test` and `cargo clippy`
- Open a PR describing what you changed and why
Apache 2.0 — see LICENSE.
