In-Memory, Append-Only, Single-Table Database Implementation by proXDhiya · Pull Request #3 · proXDhiya/simple-database-c

proXDhiya · 2025-02-02T18:41:31Z

In-Memory, Append-Only, Single-Table Database Implementation

This document describes the architecture and implementation of a simple in-memory database with the following characteristics:

- Append-only writes (no updates/deletes)
- Single hard-coded table structure
- Memory-only storage (no disk persistence)
- Fixed schema with three columns

Key Features

1. Table Schema

Stores user data with fixed columns:

column     type            size
---------------------------------
id        integer          4 bytes
username  varchar(32)     32 bytes
email     varchar(255)   255 bytes

2. Storage Characteristics

Aspect               Specification
----------------------------------------------------------------
Persistence          Volatile (in-memory only)
Max Rows             100 pages × 14 rows/page = 1,400 rows
Page Size            4 KB (4,096 bytes, a common OS page size)
Row Size             291 bytes (4 + 32 + 255)
Allocation Strategy  Demand-paged (each page allocated on first use)
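
The capacity figures above follow from integer division of the page size by the row size. A minimal sketch of that arithmetic, reusing the constant names from this description; the `ID_SIZE`, `USERNAME_SIZE`, `EMAIL_SIZE` and `TABLE_MAX_ROWS` macros and both helper functions are hypothetical names added for illustration and may not match the PR's code:

```c
#include <assert.h>
#include <stdint.h>

/* Field sizes from the schema table (macro names are assumptions). */
#define ID_SIZE 4u
#define USERNAME_SIZE 32u
#define EMAIL_SIZE 255u

#define ROW_SIZE (ID_SIZE + USERNAME_SIZE + EMAIL_SIZE)  /* 291 bytes */
#define PAGE_SIZE 4096u
#define TABLE_MAX_PAGES 100u
#define ROWS_PER_PAGE (PAGE_SIZE / ROW_SIZE)             /* integer division */
#define TABLE_MAX_ROWS (ROWS_PER_PAGE * TABLE_MAX_PAGES)

/* Compile-time checks of the figures quoted in the table. */
static_assert(ROWS_PER_PAGE == 14, "expected 14 rows per page");
static_assert(TABLE_MAX_ROWS == 1400, "expected 1,400 rows in total");

/* Hypothetical helper: bytes at the end of a page that never hold a row. */
static uint32_t unused_bytes_per_page(void) {
  return PAGE_SIZE - ROWS_PER_PAGE * ROW_SIZE;
}

/* Hypothetical helper: nonzero while another row can still be appended. */
static int table_has_room(uint32_t num_rows) {
  return num_rows < TABLE_MAX_ROWS;
}
```

Because rows never span pages in this layout, the 22 bytes left after the 14th row of each page stay unused.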

Design Implementation

Row Storage Architecture

#include <stdint.h>             // uint32_t

typedef struct {
  uint32_t id;                  // 4 bytes
  char username[32];            // 32 bytes
  char email[255];              // 255 bytes
} Row;

// Serialized layout
+---------+-----------+-----------+
| id (4)  | username  | email     |
|         | (32)      | (255)     |
+---------+-----------+-----------+
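
The packed layout in the diagram can be derived from the struct rather than hard-coded. The sketch below assumes the `Row` definition shown above; the `SIZE_OF_FIELD` macro, the `*_SIZE` and `*_OFFSET` names, and `row_struct_padding` are hypothetical and not confirmed by the PR. It also shows that `sizeof(Row)` is not necessarily 291, since a compiler may add trailing padding to the in-memory struct:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
  uint32_t id;
  char username[32];
  char email[255];
} Row;

/* Hypothetical helper macro: size of a struct member, no instance needed. */
#define SIZE_OF_FIELD(type, field) sizeof(((type*)0)->field)

#define ID_SIZE SIZE_OF_FIELD(Row, id)
#define USERNAME_SIZE SIZE_OF_FIELD(Row, username)
#define EMAIL_SIZE SIZE_OF_FIELD(Row, email)

/* Packed offsets within one serialized row. */
#define ID_OFFSET ((size_t)0)
#define USERNAME_OFFSET (ID_OFFSET + ID_SIZE)
#define EMAIL_OFFSET (USERNAME_OFFSET + USERNAME_SIZE)
#define ROW_SIZE (ID_SIZE + USERNAME_SIZE + EMAIL_SIZE)

/* The struct can never be smaller than its packed form. */
static_assert(sizeof(Row) >= ROW_SIZE, "Row is smaller than its serialized form");

/* Bytes the compiler adds to Row beyond the packed layout (platform-dependent). */
static size_t row_struct_padding(void) {
  return sizeof(Row) - ROW_SIZE;
}
```

Copying field by field at these offsets, instead of copying the whole struct, is what keeps the stored row at exactly 291 bytes regardless of struct padding.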

Memory Management

#define ROW_SIZE 291                            // 4 + 32 + 255 bytes per serialized row
#define PAGE_SIZE 4096                          // 4KB pages
#define TABLE_MAX_PAGES 100                     // Fixed page array size
#define ROWS_PER_PAGE (PAGE_SIZE / ROW_SIZE)    // 14 rows/page

typedef struct {
  uint32_t num_rows;                            // Current row count
  void* pages[TABLE_MAX_PAGES];                 // Page pointers array
} Table;

Key Functions

1. Row Serialization

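void serialize_row(Row* source, void* destination) {
  uint8_t* dst = destination;                   // Byte pointer: arithmetic on void* is not standard C
  memcpy(dst + 0, &source->id, 4);              // id at offset 0
  memcpy(dst + 4, source->username, 32);        // username at offset 4
  memcpy(dst + 36, source->email, 255);         // email at offset 36
}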
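
Reading a row back needs the inverse copy. The PR description does not show it, so the following is only a sketch of what a matching `deserialize_row` could look like, with a `const`-qualified variant of `serialize_row` for the round trip; the offset macro names are assumptions:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
  uint32_t id;
  char username[32];
  char email[255];
} Row;

/* Sizes and packed offsets (macro names are assumptions). */
#define ID_SIZE 4
#define USERNAME_SIZE 32
#define EMAIL_SIZE 255
#define ID_OFFSET 0
#define USERNAME_OFFSET (ID_OFFSET + ID_SIZE)
#define EMAIL_OFFSET (USERNAME_OFFSET + USERNAME_SIZE)
#define ROW_SIZE (EMAIL_OFFSET + EMAIL_SIZE)

static_assert(ROW_SIZE == 291, "serialized row should be 291 bytes");

/* Copies each field of *source into the packed 291-byte slot. */
void serialize_row(const Row* source, void* destination) {
  uint8_t* dst = destination;
  memcpy(dst + ID_OFFSET, &source->id, ID_SIZE);
  memcpy(dst + USERNAME_OFFSET, source->username, USERNAME_SIZE);
  memcpy(dst + EMAIL_OFFSET, source->email, EMAIL_SIZE);
}

/* Inverse of serialize_row: copies each field out of the packed slot. */
void deserialize_row(const void* source, Row* destination) {
  const uint8_t* src = source;
  memcpy(&destination->id, src + ID_OFFSET, ID_SIZE);
  memcpy(destination->username, src + USERNAME_OFFSET, USERNAME_SIZE);
  memcpy(destination->email, src + EMAIL_OFFSET, EMAIL_SIZE);
}
```

The `id` bytes are copied in host byte order, which is fine for an in-memory store but would matter if the same bytes were ever exchanged between machines.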

2. Memory Access

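void* row_slot(Table* table, uint32_t row_num) {
  uint32_t page_num = row_num / ROWS_PER_PAGE;
  if (page_num >= TABLE_MAX_PAGES) {
    return NULL;                                // Row number beyond table capacity
  }
  if (!table->pages[page_num]) {
    table->pages[page_num] = malloc(PAGE_SIZE); // Lazy allocation
    if (!table->pages[page_num]) {
      return NULL;                              // Allocation failed
    }
  }
  uint32_t byte_offset = (row_num % ROWS_PER_PAGE) * ROW_SIZE;
  return (char*)table->pages[page_num] + byte_offset;
}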

Operation Flow

Insert Operation

1. Parse input: insert <id> <username> <email>
2. Validate input syntax
3. Check table capacity
4. Serialize row to next available slot
5. Increment row counter
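
The insert flow above can be sketched as a single function. This is an illustration under stated assumptions, not the PR's code: the `ExecuteResult` codes, the `execute_insert` name, the 512-byte line limit and the `strtok`-based tokenizer are all hypothetical choices. Overlong usernames and emails are truncated here, in line with the truncation behaviour mentioned under Implementation Notes.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ROW_SIZE 291u
#define PAGE_SIZE 4096u
#define TABLE_MAX_PAGES 100u
#define ROWS_PER_PAGE (PAGE_SIZE / ROW_SIZE)
#define TABLE_MAX_ROWS (ROWS_PER_PAGE * TABLE_MAX_PAGES)

typedef struct { uint32_t id; char username[32]; char email[255]; } Row;
typedef struct { uint32_t num_rows; void* pages[TABLE_MAX_PAGES]; } Table;

/* Hypothetical result codes for this sketch. */
typedef enum {
  EXECUTE_SUCCESS,
  EXECUTE_SYNTAX_ERROR,
  EXECUTE_TABLE_FULL,
  EXECUTE_OUT_OF_MEMORY
} ExecuteResult;

/* Returns the slot for row_num, allocating its page on first use. */
static void* row_slot(Table* table, uint32_t row_num) {
  uint32_t page_num = row_num / ROWS_PER_PAGE;
  assert(page_num < TABLE_MAX_PAGES);
  if (!table->pages[page_num]) {
    table->pages[page_num] = malloc(PAGE_SIZE);
    if (!table->pages[page_num]) return NULL;
  }
  return (char*)table->pages[page_num] + (row_num % ROWS_PER_PAGE) * ROW_SIZE;
}

ExecuteResult execute_insert(Table* table, const char* line) {
  /* Parse: split a private copy of the line into whitespace-separated tokens. */
  char buf[512];
  if (strlen(line) >= sizeof buf) return EXECUTE_SYNTAX_ERROR;
  strcpy(buf, line);
  char* keyword = strtok(buf, " \t\n");
  char* id_str = strtok(NULL, " \t\n");
  char* username = strtok(NULL, " \t\n");
  char* email = strtok(NULL, " \t\n");
  char* extra = strtok(NULL, " \t\n");

  /* Validate: exactly four tokens, and the id must be a non-negative integer. */
  if (!keyword || strcmp(keyword, "insert") != 0 ||
      !id_str || !username || !email || extra) {
    return EXECUTE_SYNTAX_ERROR;
  }
  if (!isdigit((unsigned char)id_str[0])) return EXECUTE_SYNTAX_ERROR;
  char* end = NULL;
  unsigned long id = strtoul(id_str, &end, 10);
  if (*end != '\0' || id > UINT32_MAX) return EXECUTE_SYNTAX_ERROR;

  /* Check capacity before touching any page. */
  if (table->num_rows >= TABLE_MAX_ROWS) return EXECUTE_TABLE_FULL;

  /* Build the row; snprintf truncates overlong strings and NUL-terminates. */
  Row row;
  memset(&row, 0, sizeof row);
  row.id = (uint32_t)id;
  snprintf(row.username, sizeof row.username, "%s", username);
  snprintf(row.email, sizeof row.email, "%s", email);

  /* Serialize into the next free slot using the packed offsets 0, 4, 36. */
  char* slot = row_slot(table, table->num_rows);
  if (!slot) return EXECUTE_OUT_OF_MEMORY;
  memcpy(slot + 0, &row.id, 4);
  memcpy(slot + 4, row.username, 32);
  memcpy(slot + 36, row.email, 255);

  /* Only count the row once it has been fully written. */
  table->num_rows += 1;
  return EXECUTE_SUCCESS;
}
```

Incrementing `num_rows` last means a failed insert leaves the row count unchanged, although a page allocated for the failed attempt would stay allocated.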

Select Operation

1. Iterate over the stored rows, page by page, up to the current row count
2. Deserialize each row
3. Print formatted output
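
A select over this layout is a linear scan. The sketch below assumes rows are visited from 0 to `num_rows - 1`; `execute_select`, `format_row` and the forced NUL terminators in `deserialize_row` are hypothetical details of this illustration rather than confirmed parts of the PR:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define ROW_SIZE 291u
#define PAGE_SIZE 4096u
#define TABLE_MAX_PAGES 100u
#define ROWS_PER_PAGE (PAGE_SIZE / ROW_SIZE)

typedef struct { uint32_t id; char username[32]; char email[255]; } Row;
typedef struct { uint32_t num_rows; void* pages[TABLE_MAX_PAGES]; } Table;

/* Copies the packed fields out of a slot into *destination. */
static void deserialize_row(const void* source, Row* destination) {
  const char* src = source;
  memcpy(&destination->id, src + 0, 4);
  memcpy(destination->username, src + 4, 32);
  memcpy(destination->email, src + 36, 255);
  /* Defensive assumption: force termination so "%s" can never overrun. */
  destination->username[sizeof destination->username - 1] = '\0';
  destination->email[sizeof destination->email - 1] = '\0';
}

/* Hypothetical helper: writes "(id, username, email)" into out. */
static int format_row(const Row* row, char* out, size_t out_size) {
  return snprintf(out, out_size, "(%u, %s, %s)",
                  (unsigned)row->id, row->username, row->email);
}

/* Prints every stored row to out; returns the number of rows printed. */
uint32_t execute_select(const Table* table, FILE* out) {
  uint32_t printed = 0;
  for (uint32_t i = 0; i < table->num_rows; i++) {
    const char* page = table->pages[i / ROWS_PER_PAGE];
    if (!page) break;  /* a counted row should always have a page */
    Row row;
    deserialize_row(page + (i % ROWS_PER_PAGE) * ROW_SIZE, &row);
    char line[512];    /* large enough for the longest formatted row */
    format_row(&row, line, sizeof line);
    fprintf(out, "%s\n", line);
    printed++;
  }
  return printed;
}
```

Bounding the loop by `num_rows` rather than by allocated pages avoids printing the unused slots at the end of the last page.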

Limitations

Capacity Limits
- Maximum 1,400 rows (100 pages × 14 rows)
- Fixed column sizes (no variable-length fields)
Functional Constraints
- No persistence between sessions
- No update/delete operations
- No indexing or query optimization
- Single table only
Error Handling
- Basic syntax checking
- Table full detection
- No data validation (e.g., duplicate IDs)

Example Usage

$ ./database
db > insert 1 alice alice@example.com
Executed.
db > insert 2 bob bob@domain.com
Executed.
db > select
(1, alice, alice@example.com)
(2, bob, bob@domain.com)
Executed.
db > insert invalid data
Syntax error. Could not parse statement.
db > .exit

Test Considerations

Critical test scenarios include:

Boundary Conditions
- Inserting exactly 1,400 rows
- Attempting 1,401st insert (table full error)
Data Validation
- Overlength usernames/emails (truncation behavior)
- Non-integer ID values
- Missing/extra insert arguments
Memory Management
- Verify page allocation on first access
- Confirm proper memory release on exit
- Test interleaved insert/select operations
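
The memory-management scenarios can be checked by counting allocated pages before and after row access. This is a sketch only: `new_table`, `allocated_pages` and `free_table` are hypothetical helpers, and the PR may manage the table's lifetime differently.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ROW_SIZE 291u
#define PAGE_SIZE 4096u
#define TABLE_MAX_PAGES 100u
#define ROWS_PER_PAGE (PAGE_SIZE / ROW_SIZE)

typedef struct { uint32_t num_rows; void* pages[TABLE_MAX_PAGES]; } Table;

/* Hypothetical constructor: an empty table with no pages allocated yet. */
Table* new_table(void) {
  Table* table = malloc(sizeof *table);
  if (!table) return NULL;
  table->num_rows = 0;
  for (uint32_t i = 0; i < TABLE_MAX_PAGES; i++) table->pages[i] = NULL;
  return table;
}

/* Lazy lookup: allocates the page on first access, NULL on failure. */
void* row_slot(Table* table, uint32_t row_num) {
  uint32_t page_num = row_num / ROWS_PER_PAGE;
  if (page_num >= TABLE_MAX_PAGES) return NULL;
  if (!table->pages[page_num]) {
    table->pages[page_num] = malloc(PAGE_SIZE);
    if (!table->pages[page_num]) return NULL;
  }
  return (char*)table->pages[page_num] + (row_num % ROWS_PER_PAGE) * ROW_SIZE;
}

/* Hypothetical test helper: number of pages currently allocated. */
uint32_t allocated_pages(const Table* table) {
  uint32_t count = 0;
  for (uint32_t i = 0; i < TABLE_MAX_PAGES; i++) {
    if (table->pages[i]) count++;
  }
  return count;
}

/* Hypothetical destructor: releases every page, then the table itself. */
void free_table(Table* table) {
  if (!table) return;
  for (uint32_t i = 0; i < TABLE_MAX_PAGES; i++) {
    free(table->pages[i]);  /* free(NULL) is a no-op */
  }
  free(table);
}
```

Running such a test under a leak checker (for example Valgrind or AddressSanitizer) is one way to confirm that every page is released on exit.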

Implementation Notes

Append-Only Design
- Simple write pattern enables high insert speed
- No fragmentation concerns
- Tradeoff: No update/delete capabilities
Serialization Choices
- Fixed offsets enable O(1) row access
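- Serialized layout could double as an on-disk format if persistence is added later (future-proofing)
- Serialized rows are packed at exactly 291 bytes, so no padding is stored (the in-memory Row struct may still carry compiler-added padding)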
Page Allocation Strategy
- Lazy allocation reduces memory overhead
- 4KB pages align with virtual memory systems
- Array-based management simplifies access
Concurrency Model
- Single-threaded execution only
- No locking mechanisms implemented
- Not thread-safe for concurrent access
Error Recovery
- No transaction rollback capability
- Failed operations leave partial state
- Memory integrity maintained through:
  - Whole-page allocations (each page comes from a single malloc call)
  - Bounds-checked row access (the capacity check runs before a row slot is written)
Data Validation
- Minimal input sanitization:
  - Basic type checking for IDs
  - Automatic string truncation for overflow
- Missing validation for:
  - Duplicate IDs
  - Email format compliance
  - Username character restrictions
Schema Rigidity
- Hard-coded column structure
- Type system limitations:
  - No true VARCHAR implementation
  - Fixed-size character arrays
- Schema changes require recompilation
Performance Characteristics
- O(1) insert complexity (append-only)
- O(n) select complexity (full scan)
- Memory access patterns:
  - Sequential writes (inserts)
  - Sequential reads (select scans rows in insertion order)
- No cache optimization implemented
Security Considerations
- Memory-safe operations through:
  - Bounded string copies
  - Page boundary checks
- Vulnerabilities:
  - No input sanitization for special characters
  - Potential for buffer overflows in edge cases
  - Memory contents not zeroed after free
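
One way to address the last point is to overwrite each page before releasing it. The sketch below is an assumption about how that could be done, not part of the PR: `secure_zero` and `free_table_pages_zeroed` are hypothetical names, and the `volatile` write loop is a common idiom rather than a guarantee, since a plain `memset` before `free` may be removed by the optimizer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE 4096u
#define TABLE_MAX_PAGES 100u

typedef struct { uint32_t num_rows; void* pages[TABLE_MAX_PAGES]; } Table;

/* Hypothetical helper: zeroes len bytes through a volatile pointer so the
 * compiler is unlikely to drop the writes as dead stores. */
static void secure_zero(void* ptr, size_t len) {
  volatile unsigned char* p = ptr;
  while (len--) {
    *p++ = 0;
  }
}

/* Hypothetical helper: wipes and frees every allocated page, then resets
 * the table so it can be reused as an empty table. */
void free_table_pages_zeroed(Table* table) {
  for (uint32_t i = 0; i < TABLE_MAX_PAGES; i++) {
    if (table->pages[i]) {
      secure_zero(table->pages[i], PAGE_SIZE);
      free(table->pages[i]);
      table->pages[i] = NULL;
    }
  }
  table->num_rows = 0;
}
```

Where available, platform functions such as `explicit_bzero` or the optional C11 `memset_s` serve the same purpose.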

proXDhiya added 2 commits February 2, 2025 19:39

- Chore: update (e1a05b3)
- Feat: In-Memory, Append-Only, Single-Table Database Implementation (d20d32b)

proXDhiya self-assigned this Feb 2, 2025

proXDhiya added the documentation and enhancement labels Feb 2, 2025

proXDhiya wants to merge 2 commits into main from Feature/Compiler-and-VM
