
@wnsah814
Contributor

@wnsah814 wnsah814 commented Jun 9, 2025

Problem Statement

BBSSD currently does not handle TRIM operations properly, which can significantly degrade performance. Without TRIM support, garbage collection (GC) cannot tell which pages the host has invalidated and keeps treating stale pages as valid, leading to inefficient space management and lower throughput in write-intensive workloads.

Experimental Environment

Host System

  • CPU: AMD Ryzen Threadripper PRO 3955WX 16-Cores (16 cores, 32 threads)
  • Memory: 503GB DDR4
  • OS: Ubuntu 20.04.6 LTS (Kernel 5.15.6)

FEMU Virtual Environment

  • Guest OS: Ubuntu 20.04.6 LTS (Kernel 5.15.0)
  • Default BBSSD parameters

Performance Impact Analysis

I conducted experiments using fxmark to demonstrate the performance impact:

Experimental Setup

  • Ran the DWAL workload to fill the storage with data (creating the root/ directory)
  • Deleted the generated data to trigger invalidation (TRIM operations)
  • Ran two follow-up experiments:
    (1) Re-ran the DWAL workload
    (2) Ran the DWOL workload

Performance Results

Experiment (1) - DWAL after TRIM:

  • Without TRIM: 98.53 s, 3,065,729 works, 31,113 works/sec
  • With TRIM: 46.30 s, 3,065,734 works, 66,211 works/sec
    → 2.13x performance improvement

Experiment (2) - DWOL after TRIM:

  • Without TRIM: 300.02 s, 15,687,971 works, 52,288 works/sec
  • With TRIM: 300.00 s, 21,408,185 works, 71,360 works/sec
    → 1.36x performance improvement

In both experiments, TRIM support improved throughput. This is consistent with more efficient garbage collection once trimmed pages are accurately marked invalid in the mapping table.
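
The improvement factors above are the ratio of the two works/sec figures in each experiment. A minimal sketch of that arithmetic follows; the function names are hypothetical and appear neither in fxmark nor in this PR.

```c
#include <stdio.h>

/* Speedup of the TRIM-enabled run relative to the run without TRIM,
 * both given as throughput in works/sec. */
double speedup(double with_trim, double without_trim)
{
    return with_trim / without_trim;
}

/* Print both ratios using the works/sec figures reported above. */
void print_speedups(void)
{
    printf("DWAL: %.2fx\n", speedup(66211.0, 31113.0));
    printf("DWOL: %.2fx\n", speedup(71360.0, 52288.0));
}
```

Calling print_speedups() reproduces the rounded 2.13x and 1.36x figures from the reported throughputs.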

Implementation Details

Key Features

  • Complete DSM TRIM functionality: Handles multiple ranges per command (up to 256 ranges per NVMe spec)
  • FTL integration: Added ssd_trim() function that properly invalidates pages and updates mapping tables
  • Range validation: Comprehensive boundary checking and error handling
  • Memory management: Proper allocation/deallocation with cleanup on failures
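
As a rough illustration of the range handling and boundary checking listed above, here is a minimal sketch. It assumes the 16-byte DSM range layout (context attributes, length in logical blocks, starting LBA), the 0-based range count in the low byte of CDW10, and the deallocate attribute at bit 2 of CDW11, as I understand the NVMe DSM definition. The type and function names are hypothetical and are not the ones used in nvme-io.c; endianness conversion is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

#define DSM_MAX_RANGES      256u       /* NR is an 8-bit, 0-based field */
#define DSM_ATTR_DEALLOCATE (1u << 2)  /* assumed position of the AD bit in CDW11 */

/* Hypothetical 16-byte range descriptor mirroring the DSM range layout. */
typedef struct {
    uint32_t cattr; /* context attributes */
    uint32_t nlb;   /* length in logical blocks */
    uint64_t slba;  /* starting LBA */
} DsmRange;

/* Decode the 0-based NR field in CDW10 into a range count (1..256). */
uint32_t dsm_range_count(uint32_t cdw10)
{
    return (cdw10 & 0xffu) + 1u;
}

/* True when [slba, slba + nlb) lies inside a namespace of ns_blocks LBAs.
 * Written as a subtraction so slba + nlb cannot overflow. */
bool dsm_range_in_bounds(const DsmRange *r, uint64_t ns_blocks)
{
    if (r->slba > ns_blocks) {
        return false;
    }
    return r->nlb <= ns_blocks - r->slba;
}
```

A caller would decode the count first, then reject or skip any range for which dsm_range_in_bounds() returns false before handing the rest to the FTL.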

Code Changes

  • ftl.c: Added ssd_trim() function to handle deallocate operations at FTL level
  • nvme-io.c: Enhanced nvme_dsm() command processing with proper validation
  • nvme.h: Extended NvmeRequest structure to support multiple DSM ranges
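
To make the FTL-level step concrete, the following is a toy sketch of what invalidating pages for one trimmed range could look like. It is not the ssd_trim() from this PR: the ToyFtl structure, the flat mapping array, and the choice to trim only pages fully covered by the range are all assumptions made for illustration, and the range is assumed to have been bounds-checked already.

```c
#include <stdint.h>

#define UNMAPPED UINT64_MAX

/* Hypothetical, heavily simplified FTL state. */
typedef struct {
    uint64_t *l2p;         /* logical page -> physical page, or UNMAPPED */
    uint8_t  *valid;       /* per-physical-page valid flag */
    uint64_t  npages;      /* number of logical pages */
    uint32_t  secs_per_pg; /* sectors (LBAs) per flash page */
} ToyFtl;

/* Invalidate every logical page fully covered by [slba, slba + nlb).
 * Returns the number of pages that were mapped and are now invalid. */
uint64_t toy_trim(ToyFtl *f, uint64_t slba, uint64_t nlb)
{
    uint64_t first = (slba + f->secs_per_pg - 1) / f->secs_per_pg; /* round up */
    uint64_t end = (slba + nlb) / f->secs_per_pg;   /* round down, exclusive */
    uint64_t trimmed = 0;

    if (end > f->npages) {
        end = f->npages;
    }
    for (uint64_t lpn = first; lpn < end; lpn++) {
        uint64_t ppn = f->l2p[lpn];
        if (ppn == UNMAPPED) {
            continue;             /* never written or already trimmed */
        }
        f->valid[ppn] = 0;        /* GC no longer needs to relocate this page */
        f->l2p[lpn] = UNMAPPED;   /* drop the mapping entry */
        trimmed++;
    }
    return trimmed;
}
```

A multi-range command would call this once per range. Skipping partially covered pages keeps untrimmed sectors readable; a real FTL may handle those edges differently.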

Questions for Review

  • Multi-range handling: I modified the NvmeRequest structure to pass multiple DSM ranges from nvme_dsm to ftl_thread. The original implementation seemed to only handle single ranges. Is this approach acceptable, or is there a better way to handle multiple ranges?
  • Debug output: Many debug statements are currently commented out. Should I clean these up before merging, or would you prefer to keep them for debugging purposes?

Next Steps

I'd appreciate feedback on the implementation approach, particularly regarding the multi-range handling mechanism. Once approved, I can clean up the commented debug code and address any additional concerns.

wnsah814 added 2 commits June 9, 2025 16:50
- Replace free() with g_free() to match g_malloc0() allocation.
- Implement NVMe DSM TRIM functionality with FTL integration, range validation, and proper error handling.
@huaicheng huaicheng requested a review from Copilot July 15, 2025 04:24

Copilot AI left a comment

Pull Request Overview

This PR adds support for NVMe TRIM (deallocate) operations to the BBSSD FTL path, enabling proper invalidation of logical blocks and improved garbage collection.

  • Introduces DSM range fields in NvmeRequest to carry multi-range TRIM info
  • Enhances nvme-io.c to parse, validate, and enqueue DSM commands with cleanup on failure
  • Implements ssd_trim() in ftl.c and integrates it into the FTL thread dispatch

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File                  Description
hw/femu/nvme.h        Added dsm_ranges, dsm_nr_ranges, dsm_attributes to NvmeRequest
hw/femu/nvme-io.c     Initialized DSM fields, parsed CDWs into ranges, validated and forwarded to FTL, improved error cleanup
hw/femu/bbssd/ftl.c   Added ssd_trim() to process multi-range TRIMs, wired into FTL thread loop

Comments suppressed due to low confidence (1)

hw/femu/bbssd/ftl.c:861

  • [nitpick] The new TRIM path exercises multi-range parsing, boundary checks, and bitmap updates. Consider adding unit or integration tests to cover valid trims, out-of-bounds ranges, and empty requests to ensure correctness under diverse conditions.
static uint64_t ssd_trim(struct ssd *ssd, NvmeRequest *req)

case NVME_CMD_READ:
    lat = ssd_read(ssd, req);
    break;
case NVME_CMD_DSM:

Copilot AI Jul 15, 2025

The latency variable (lat) is not initialized for the DSM case when req->dsm_ranges is empty, leading to undefined behavior. Consider setting lat = 0; before the conditional or restoring the default assignment.
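
A minimal sketch of the fix described in this comment, using a toy dispatcher rather than the actual FTL thread loop: the ToyReq type, the helper functions, and the latency values are hypothetical placeholders; only the pattern of giving lat a defined value on every path is the point.

```c
#include <stddef.h>
#include <stdint.h>

enum { CMD_READ = 0x02, CMD_DSM = 0x09 }; /* NVMe I/O opcodes */

/* Hypothetical stand-in for the request fields relevant here. */
typedef struct {
    uint8_t     opcode;
    const void *dsm_ranges;
    int         dsm_nr_ranges;
} ToyReq;

/* Placeholder latencies in nanoseconds, not measured values. */
uint64_t toy_read(const ToyReq *req) { (void)req; return 40000; }
uint64_t toy_trim_latency(const ToyReq *req) { (void)req; return 1000; }

uint64_t toy_dispatch(const ToyReq *req)
{
    uint64_t lat = 0; /* defined even when a case assigns nothing */

    switch (req->opcode) {
    case CMD_READ:
        lat = toy_read(req);
        break;
    case CMD_DSM:
        if (req->dsm_ranges != NULL && req->dsm_nr_ranges > 0) {
            lat = toy_trim_latency(req);
        }
        break; /* empty DSM request: lat stays 0 */
    default:
        break;
    }
    return lat;
}
```

With the initializer in place, a DSM request that carries no ranges returns a latency of 0 instead of reading an indeterminate value.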

// If no deallocate attribute, no need to process further for TRIM
if (!has_deallocate) {
    femu_err("DSM: No deallocate attribute set, skipping\n");

Copilot AI Jul 15, 2025

[nitpick] Skipping DSM processing when the deallocate attribute is not set is expected behavior, so logging it as an error may confuse readers. Use a debug-level log, and set req->cmd_opcode before early returns to keep request routing consistent.

Suggested change
- femu_err("DSM: No deallocate attribute set, skipping\n");
+ femu_debug("DSM: No deallocate attribute set, skipping\n");
+ req->cmd_opcode = cmd->opcode;

int total_out_of_bounds = 0;

if (!ranges || nr_ranges <= 0) {
    printf("TRIM: Invalid ranges or count\n");

Copilot AI Jul 15, 2025

[nitpick] Using printf in the FTL path is inconsistent with existing ftl_err/ftl_debug logging macros. Consider switching to the project’s logging API for consistency and log filtering.

Suggested change
- printf("TRIM: Invalid ranges or count\n");
+ ftl_err("TRIM: Invalid ranges or count\n");

Comment on lines +868 to +870
int total_trimmed_pages = 0;
int total_already_invalid = 0;
int total_out_of_bounds = 0;

Copilot AI Jul 15, 2025

[nitpick] The counters total_trimmed_pages, total_already_invalid, and total_out_of_bounds are never used or logged. Remove them or emit a summary log to reduce dead code and improve observability.

Suggested change
- int total_trimmed_pages = 0;
- int total_already_invalid = 0;
- int total_out_of_bounds = 0;
+ // Removed unused counters: total_trimmed_pages, total_already_invalid, total_out_of_bounds
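
The other option this comment mentions, emitting a summary log instead of deleting the counters, could look roughly like the sketch below. The TrimStats type and trim_stats_format() helper are hypothetical; in the actual code the formatted text would presumably go through the project's ftl_debug macro rather than into a caller-supplied buffer.

```c
#include <stdio.h>

/* Hypothetical grouping of the three counters from the snippet above. */
typedef struct {
    int trimmed_pages;
    int already_invalid;
    int out_of_bounds;
} TrimStats;

/* Format a one-line summary into buf; returns what snprintf returns. */
int trim_stats_format(char *buf, size_t len, const TrimStats *s)
{
    return snprintf(buf, len,
                    "TRIM: trimmed=%d already_invalid=%d out_of_bounds=%d",
                    s->trimmed_pages, s->already_invalid, s->out_of_bounds);
}
```

Logging one summary per command keeps the counters useful without adding a log line per page.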

@huaicheng huaicheng merged commit d749928 into MoatLab:master Jul 15, 2025
@huaicheng
Contributor

Looks good to me! That said, we will need to run more thorough testing. I'm merging this into the master branch now, and we can follow up with additional fixes as needed.

Thanks so much for this patch. It’s been a long time coming, and I’m glad to see it finally addressed! 😊
