Integrate loriot-websocket-mcp data persistence patterns #33

@avrabe

Summary

Integrate the sophisticated data persistence patterns from loriot-websocket-mcp into the main framework, providing configurable long-term storage with two-tier memory management.

Background

The loriot-websocket-mcp project implements several advanced persistence strategies that would benefit the broader MCP ecosystem:

  • JSON Lines format for efficient append-only storage
  • Two-tier memory management with configurable in-memory cache + file backup
  • Automatic deduplication keyed on timestamp + device identifier
  • Configurable retention policies with backup rotation
  • Startup data loading from existing storage files

These patterns are particularly valuable for MCP servers handling time-series data, sensor readings, or any scenario requiring long-term data retention with efficient access.

Implementation Tasks

Core Persistence Infrastructure

  • Create pulseengine-mcp-persistence crate
  • Extract and generalize persistence patterns from loriot-websocket-mcp
  • Implement trait-based abstractions for different storage formats
  • Add configurable retention and archival policies
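
As a rough illustration of the trait-based abstraction listed above, the sketch below defines a storage trait and one in-memory implementation. The names `Record`, `RecordStore`, and `MemoryStore` are hypothetical and do not come from loriot-websocket-mcp; a JSON Lines backend would be a second implementation of the same trait.

```rust
use std::io;

/// Hypothetical record type: one reading from one source at one point in time.
#[derive(Clone, Debug, PartialEq)]
pub struct Record {
    pub timestamp_ms: u64,
    pub source: String,
    pub payload: String,
}

/// Hypothetical storage abstraction; each storage format would implement it.
pub trait RecordStore {
    /// Stores one record.
    fn append(&mut self, record: Record) -> io::Result<()>;
    /// Number of records currently held.
    fn len(&self) -> usize;
    /// Records with `start_ms <= timestamp < end_ms`, oldest first.
    fn range(&self, start_ms: u64, end_ms: u64) -> Vec<Record>;
}

/// Memory-only implementation, useful for tests or environments without a filesystem.
#[derive(Default)]
pub struct MemoryStore {
    records: Vec<Record>,
}

impl RecordStore for MemoryStore {
    fn append(&mut self, record: Record) -> io::Result<()> {
        self.records.push(record);
        Ok(())
    }

    fn len(&self) -> usize {
        self.records.len()
    }

    fn range(&self, start_ms: u64, end_ms: u64) -> Vec<Record> {
        let mut out: Vec<Record> = self
            .records
            .iter()
            .filter(|r| r.timestamp_ms >= start_ms && r.timestamp_ms < end_ms)
            .cloned()
            .collect();
        // Records may arrive out of order, so sort before returning.
        out.sort_by_key(|r| r.timestamp_ms);
        out
    }
}
```

Keeping the trait small means retention, deduplication, and caching can be layered on top of any backend rather than reimplemented per format.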

JSON Lines Storage Implementation

  • Implement efficient JSON Lines writer with buffering
  • Add atomic append operations with crash recovery
  • Create indexed access for query operations
  • Implement compression for archived data
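
The buffered writer and crash recovery items above might look roughly like this. It is a minimal sketch using only the standard library: `JsonlWriter` and `load_complete_lines` are hypothetical names, records are passed in already serialized, and "crash recovery" here only means skipping a final line that was never terminated by a newline. It does not make appends atomic.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, BufRead, BufReader, BufWriter, Write};
use std::path::Path;

/// Hypothetical buffered, append-only JSON Lines writer.
pub struct JsonlWriter {
    inner: BufWriter<File>,
}

impl JsonlWriter {
    /// Opens the file in append mode, creating it if it does not exist.
    pub fn open(path: &Path) -> io::Result<Self> {
        let file = OpenOptions::new().create(true).append(true).open(path)?;
        Ok(Self { inner: BufWriter::new(file) })
    }

    /// Appends one already-serialized JSON value as a single line.
    pub fn append(&mut self, json: &str) -> io::Result<()> {
        if json.contains('\n') {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "a JSON Lines record must not contain a newline",
            ));
        }
        self.inner.write_all(json.as_bytes())?;
        self.inner.write_all(b"\n")
    }

    /// Flushes the buffer and asks the OS to write the data to disk.
    pub fn sync(&mut self) -> io::Result<()> {
        self.inner.flush()?;
        self.inner.get_ref().sync_data()
    }
}

/// Loads all complete lines from an existing file (startup data loading).
/// A trailing line without a newline is treated as a torn write and skipped.
pub fn load_complete_lines(path: &Path) -> io::Result<Vec<String>> {
    let mut reader = BufReader::new(File::open(path)?);
    let mut lines = Vec::new();
    let mut buf = String::new();
    loop {
        buf.clear();
        if reader.read_line(&mut buf)? == 0 {
            break; // end of file
        }
        if buf.ends_with('\n') {
            let line = buf.trim_end_matches(|c| c == '\n' || c == '\r');
            if !line.is_empty() {
                lines.push(line.to_string());
            }
        }
    }
    Ok(lines)
}
```

How often `sync` is called is the trade-off between throughput and how much data a crash can lose, which is what `sync_interval_minutes` in the configuration would control.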

Two-Tier Memory Management

  • Design configurable in-memory cache layer
  • Implement LRU eviction with configurable limits
  • Add background synchronization to persistent storage
  • Create memory pressure handling and adaptive sizing
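
One way to picture the two tiers is a bounded in-memory window in front of a durable store. The sketch below is an assumption-laden simplification: `TwoTierBuffer` is a hypothetical name, a `Vec` stands in for the file, and eviction is oldest-first (FIFO) rather than true LRU, which would also need to track reads.

```rust
use std::collections::VecDeque;

/// Hypothetical two-tier buffer: a bounded "hot" window in memory plus a
/// durable tier that receives every reading on the next sync.
pub struct TwoTierBuffer {
    limit: usize,
    hot: VecDeque<String>,
    pending: Vec<String>,
    durable: Vec<String>, // stands in for the JSON Lines file
}

impl TwoTierBuffer {
    pub fn new(limit: usize) -> Self {
        Self {
            limit,
            hot: VecDeque::new(),
            pending: Vec::new(),
            durable: Vec::new(),
        }
    }

    /// Records a reading; the oldest in-memory readings are evicted past the limit.
    pub fn push(&mut self, reading: String) {
        self.pending.push(reading.clone());
        self.hot.push_back(reading);
        while self.hot.len() > self.limit {
            self.hot.pop_front();
        }
    }

    /// Moves everything not yet persisted into the durable tier.
    /// Returns how many readings were written.
    pub fn sync(&mut self) -> usize {
        let written = self.pending.len();
        self.durable.append(&mut self.pending);
        written
    }

    /// Readings currently held in memory, oldest first.
    pub fn hot(&self) -> Vec<&str> {
        self.hot.iter().map(String::as_str).collect()
    }

    pub fn durable_len(&self) -> usize {
        self.durable.len()
    }
}
```

Eviction from the hot tier never loses data in this sketch because every reading is also queued in `pending` until the next sync; a real implementation would run `sync` from a background task.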

Data Deduplication

  • Implement configurable deduplication strategies:
    • Timestamp-based deduplication
    • Content hash-based deduplication
    • Custom key-based deduplication
  • Add bloom filters for efficient duplicate detection
  • Create deduplication metrics and monitoring
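
The three strategies above differ only in how the deduplication key is derived, which the following sketch tries to show. The variant names `TimestampAndSource` and `ContentHash` appear in the usage examples in this issue; the `CustomKey` variant, the `Reading` type, and the `Deduplicator` struct are assumptions made for illustration. An exact `HashSet` is used instead of a bloom filter, and `DefaultHasher` is only suitable for in-process comparison because its output is not guaranteed stable across Rust releases.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical reading type.
#[derive(Clone, Debug)]
pub struct Reading {
    pub timestamp_ms: u64,
    pub source: String,
    pub payload: String,
}

/// Sketch of the strategy enum; `CustomKey` is an assumed shape.
pub enum DeduplicationStrategy {
    TimestampAndSource,
    ContentHash,
    CustomKey(fn(&Reading) -> String),
}

/// Tracks the keys seen so far and rejects repeats.
pub struct Deduplicator {
    strategy: DeduplicationStrategy,
    seen: HashSet<String>,
}

impl Deduplicator {
    pub fn new(strategy: DeduplicationStrategy) -> Self {
        Self { strategy, seen: HashSet::new() }
    }

    /// Derives the deduplication key for a reading under the chosen strategy.
    fn key(&self, r: &Reading) -> String {
        match &self.strategy {
            DeduplicationStrategy::TimestampAndSource => {
                format!("{}|{}", r.timestamp_ms, r.source)
            }
            DeduplicationStrategy::ContentHash => {
                let mut hasher = DefaultHasher::new();
                r.payload.hash(&mut hasher);
                format!("{:016x}", hasher.finish())
            }
            DeduplicationStrategy::CustomKey(f) => f(r),
        }
    }

    /// Returns `true` if the reading is new, `false` if it is a duplicate.
    pub fn insert(&mut self, r: &Reading) -> bool {
        let key = self.key(r);
        self.seen.insert(key)
    }
}
```

A bloom filter could sit in front of the `HashSet` to avoid keeping every key in memory, at the cost of occasional false positives that would need a second check against storage.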

Configuration System

#[derive(Clone, Debug, Default, Serialize, Deserialize)]
pub struct PersistenceConfig {
    /// Storage file path
    pub file_path: String,
    /// Maximum readings kept in memory
    pub memory_limit_readings: usize,
    /// Number of backup files to maintain
    pub backup_count: u32,
    /// Sync interval for background persistence
    pub sync_interval_minutes: u64,
    /// Deduplication strategy
    pub deduplication: DeduplicationStrategy,
    /// Compression settings
    pub compression: CompressionConfig,
    /// Retention policy
    pub retention: RetentionPolicy,
}
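
The struct above refers to three supporting types that this issue does not define. The sketch below guesses at their shape from the usage examples further down (`TimestampAndSource`, `ContentHash`, `KeepDays`, `KeepCount`, `Gzip { level }`). The `CompressionConfig::None` variant, the field types inside the variants, and every default value are assumptions, and the serde derives are left out so the block compiles with the standard library alone.

```rust
/// How duplicates are identified (variant names taken from the usage examples).
#[derive(Clone, Debug, Default, PartialEq)]
pub enum DeduplicationStrategy {
    #[default]
    TimestampAndSource,
    ContentHash,
}

/// Compression for stored data. `None` is a hypothetical variant used as the default.
#[derive(Clone, Debug, Default, PartialEq)]
pub enum CompressionConfig {
    #[default]
    None,
    Gzip { level: u32 },
}

/// How long data is kept.
#[derive(Clone, Debug, PartialEq)]
pub enum RetentionPolicy {
    KeepDays(u32),
    KeepCount(usize),
}

impl Default for RetentionPolicy {
    fn default() -> Self {
        RetentionPolicy::KeepDays(90) // assumed default
    }
}

#[derive(Clone, Debug, PartialEq)]
pub struct PersistenceConfig {
    pub file_path: String,
    pub memory_limit_readings: usize,
    pub backup_count: u32,
    pub sync_interval_minutes: u64,
    pub deduplication: DeduplicationStrategy,
    pub compression: CompressionConfig,
    pub retention: RetentionPolicy,
}

impl Default for PersistenceConfig {
    fn default() -> Self {
        // All values here are illustrative assumptions, not project defaults.
        Self {
            file_path: "data.jsonl".to_string(),
            memory_limit_readings: 5000,
            backup_count: 7,
            sync_interval_minutes: 15,
            deduplication: DeduplicationStrategy::default(),
            compression: CompressionConfig::default(),
            retention: RetentionPolicy::default(),
        }
    }
}
```

A `Default` implementation is what allows callers to write `..Default::default()` and override only the fields they care about.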

Integration with Storage Backend Abstraction

Query and Retrieval Interface

  • Design time-range query interface
  • Implement efficient data filtering and aggregation
  • Add streaming query results for large datasets
  • Create pagination support for web interfaces
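
A time-range query with pagination can be sketched on top of an ordered map. `TimeIndex` and `query_page` are hypothetical names, and the index lives entirely in memory here; an on-disk index would need a different structure, but the interface could stay similar.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

/// Hypothetical in-memory index from timestamp to the payloads recorded at that time.
#[derive(Default)]
pub struct TimeIndex {
    by_time: BTreeMap<u64, Vec<String>>,
}

impl TimeIndex {
    pub fn insert(&mut self, timestamp_ms: u64, payload: impl Into<String>) {
        self.by_time.entry(timestamp_ms).or_default().push(payload.into());
    }

    /// Returns one page of `(timestamp, payload)` pairs inside `range`, oldest first.
    /// `offset` and `limit` select the page.
    pub fn query_page(
        &self,
        range: Range<u64>,
        offset: usize,
        limit: usize,
    ) -> Vec<(u64, String)> {
        // BTreeMap::range panics when start > end, so reject such ranges first.
        if range.start >= range.end {
            return Vec::new();
        }
        self.by_time
            .range(range)
            .flat_map(|(t, payloads)| payloads.iter().map(move |p| (*t, p.clone())))
            .skip(offset)
            .take(limit)
            .collect()
    }
}
```

Offset-based paging is the simplest option; for large datasets a cursor (the last timestamp seen) would avoid rescanning skipped entries on every page.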

Backup and Recovery

  • Implement incremental backup strategies
  • Add data integrity verification
  • Create recovery tools for corrupted files
  • Implement backup compression and encryption
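
Backup rotation with a fixed `backup_count` might work along these lines. This is a sketch under assumptions: backups are full copies named `<file>.1` (newest) through `<file>.N` (oldest), and `rotate_backups` is a hypothetical function. Incremental backups, integrity checks, and encryption are not covered.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Builds the path of the n-th backup, e.g. `data.jsonl.1`.
fn backup_path(base: &Path, n: u32) -> PathBuf {
    let mut name = base.as_os_str().to_os_string();
    name.push(format!(".{}", n));
    PathBuf::from(name)
}

/// Rotates backups of `base`, keeping at most `backup_count` copies.
pub fn rotate_backups(base: &Path, backup_count: u32) -> io::Result<()> {
    if backup_count == 0 {
        return Ok(());
    }
    // Drop the oldest backup so the shift below never exceeds the limit.
    let oldest = backup_path(base, backup_count);
    if oldest.exists() {
        fs::remove_file(&oldest)?;
    }
    // Shift .N-1 -> .N, ..., .1 -> .2, starting from the oldest.
    for n in (1..backup_count).rev() {
        let from = backup_path(base, n);
        if from.exists() {
            fs::rename(&from, backup_path(base, n + 1))?;
        }
    }
    // Copy the live file into slot .1.
    if base.exists() {
        fs::copy(base, backup_path(base, 1))?;
    }
    Ok(())
}
```

Copying rather than renaming the live file keeps it available to the writer during rotation, at the cost of briefly doubling disk usage.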

WASM Compatibility

  • Adapt file I/O operations for WASI filesystem interfaces
  • Implement memory-only mode for pure WASM environments
  • Add component-model persistence interfaces
  • Create browser-compatible storage using IndexedDB

Integration Points

MCP Server Framework Integration

  • Add persistence middleware for automatic data collection
  • Create configurable persistence policies per resource type
  • Implement resource versioning and history tracking
  • Add metrics collection for persistence operations

Caching Framework Integration (#28)

  • Bridge persistence layer with generalized caching
  • Implement cache-backed persistence for hot data
  • Add cache warming from persistent storage
  • Create unified memory management across cache and persistence

Monitoring and Metrics

  • Add persistence performance metrics
  • Implement storage utilization monitoring
  • Create data quality and integrity checks
  • Add alerting for storage issues

Example Usage Patterns

Sensor Data Collection

let persistence = PersistenceManager::new(PersistenceConfig {
    file_path: "sensor_data.jsonl".into(),
    memory_limit_readings: 5000,
    backup_count: 7,
    sync_interval_minutes: 15,
    deduplication: DeduplicationStrategy::TimestampAndSource,
    retention: RetentionPolicy::KeepDays(90),
    // `compression` is not set here, so it is taken from the defaults
    ..Default::default()
});

// Automatic persistence of MCP resource updates
persistence.record_resource_update(resource_uri, data, timestamp).await?;

// Query historical data
let history = persistence.query_range(
    resource_uri,
    start_time..end_time,
    QueryOptions::default()
).await?;

Chat/Conversation History

let config = PersistenceConfig {
    deduplication: DeduplicationStrategy::ContentHash,
    retention: RetentionPolicy::KeepCount(10000),
    compression: CompressionConfig::Gzip { level: 6 },
    ..Default::default()
};
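
The retention settings used in these examples could be enforced with a pruning step like the one below. It is only a sketch: the variant payload types, the `apply_retention` function, and the representation of entries as `(timestamp_ms, payload)` pairs sorted oldest first are all assumptions.

```rust
/// Retention variants as used in the examples; payload types are assumed.
#[derive(Clone, Copy, Debug)]
pub enum RetentionPolicy {
    KeepDays(u32),
    KeepCount(usize),
}

const MS_PER_DAY: u64 = 24 * 60 * 60 * 1000;

/// Prunes `entries` in place. Entries must be sorted oldest first.
/// `now_ms` is passed in so the function stays deterministic and testable.
pub fn apply_retention(
    entries: &mut Vec<(u64, String)>,
    policy: RetentionPolicy,
    now_ms: u64,
) {
    match policy {
        RetentionPolicy::KeepCount(max) => {
            // Drop the oldest entries until only `max` remain.
            if entries.len() > max {
                let excess = entries.len() - max;
                entries.drain(..excess);
            }
        }
        RetentionPolicy::KeepDays(days) => {
            // Keep entries whose timestamp falls inside the retention window.
            let window = u64::from(days) * MS_PER_DAY;
            let cutoff = now_ms.saturating_sub(window);
            entries.retain(|(t, _)| *t >= cutoff);
        }
    }
}
```

For a conversation log, `KeepCount` bounds storage regardless of how bursty the traffic is, while `KeepDays` suits sensor data where age matters more than volume.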

Acceptance Criteria

  • Efficient JSON Lines storage with configurable retention
  • Two-tier memory management with adaptive sizing
  • Multiple deduplication strategies implemented
  • WASM-compatible persistence implementations
  • Integration with existing storage backend abstraction
  • Comprehensive test suite including performance benchmarks
  • Migration tools from loriot-websocket-mcp data format

Related Issues

References

Labels

enhancement: New feature or request
