
Add Cursor rules #1294

Merged
sarahyurick merged 11 commits into NVIDIA-NeMo:main from sarahyurick:cursor_rules
Dec 16, 2025

Conversation

@sarahyurick
Contributor

No description provided.

Signed-off-by: Sarah Yurick <sarahyurick@gmail.com>
@copy-pr-bot

copy-pr-bot Bot commented Dec 12, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

sarahyurick and others added 4 commits December 12, 2025 10:30
Signed-off-by: Sarah Yurick <sarahyurick@gmail.com>
Signed-off-by: Sarah Yurick <sarahyurick@gmail.com>
Signed-off-by: Sarah Yurick <sarahyurick@gmail.com>
@sarahyurick sarahyurick marked this pull request as ready for review December 12, 2025 20:17
@greptile-apps
Contributor

greptile-apps Bot commented Dec 12, 2025

Greptile Overview

Greptile Summary

This PR adds comprehensive Cursor IDE rules documentation to guide AI-assisted development in the NeMo Curator codebase.

Changes:

  • Added 8 .mdc files in .cursor/rules/ covering coding standards, architecture patterns, and API documentation
  • Documents coding standards including Ruff linting rules, copyright headers, Python version support (3.10-3.12), and 80% test coverage requirement
  • Provides detailed architecture guidance for ProcessingStage, CompositeStage, Pipeline, and Executor patterns
  • Includes modality-specific structure documentation for text, image, audio, and video processing
  • Documents Resources configuration with GPU memory allocation constraints
  • Covers Task patterns including DocumentBatch, VideoTask, ImageBatch, AudioBatch, and EmptyTask
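
The stage and pipeline pattern summarized above can be pictured with a small, self-contained sketch. Everything here is a toy stand-in written for illustration: the `Task`, `ProcessingStage`, and `Pipeline` classes below only borrow names from the summary (plus `add_stage` and `run` from the sequence diagram) and are assumptions, not the actual nemo_curator API or signatures.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Task:
    """Hypothetical unit of work passed between stages."""
    task_id: str
    data: Any


class ProcessingStage(ABC):
    """Toy base class: a stage transforms one task into another."""

    @abstractmethod
    def process(self, task: Task) -> Task: ...


class StripStage(ProcessingStage):
    def process(self, task: Task) -> Task:
        # Remove leading and trailing whitespace from the payload.
        return Task(task_id=task.task_id, data=task.data.strip())


class LowercaseStage(ProcessingStage):
    def process(self, task: Task) -> Task:
        # Normalize the payload to lowercase.
        return Task(task_id=task.task_id, data=task.data.lower())


@dataclass
class Pipeline:
    """Toy pipeline: runs stages in the order they were added."""
    name: str
    stages: list[ProcessingStage] = field(default_factory=list)

    def add_stage(self, stage: ProcessingStage) -> "Pipeline":
        self.stages.append(stage)
        return self  # returning self allows chained add_stage calls

    def run(self, tasks: list[Task]) -> list[Task]:
        for stage in self.stages:
            tasks = [stage.process(t) for t in tasks]
        return tasks


pipeline = Pipeline(name="toy").add_stage(StripStage()).add_stage(LowercaseStage())
results = pipeline.run([Task(task_id="0", data="  Hello World  ")])
```

In this sketch the pipeline runs stages itself; in the documented design, execution is delegated to an executor, which this toy omits.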

Validation:
All documentation was cross-referenced against actual implementation in nemo_curator/stages/base.py, nemo_curator/stages/resources.py, nemo_curator/pipeline/pipeline.py, nemo_curator/backends/xenna/executor.py, and pyproject.toml. The documentation accurately reflects the codebase structure, API signatures, constraints, and coding standards.

Confidence Score: 5/5

  • This PR is completely safe to merge - it only adds documentation files with no code changes
  • Perfect score because this PR adds only documentation files (.mdc Cursor rules) with zero code changes. All documented patterns were verified against actual implementation and found to be accurate. No runtime impact, no security concerns, no breaking changes.
  • No files require special attention - all documentation is accurate and well-structured

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| .cursor/rules/coding-standards.mdc | 5/5 | Comprehensive documentation of Ruff linting rules, copyright headers, Python version support, and testing requirements - all accurate |
| .cursor/rules/composite-stage-patterns.mdc | 5/5 | Clear documentation of CompositeStage pattern, decomposition, and with_() configuration - matches implementation |
| .cursor/rules/executors.mdc | 5/5 | Detailed executor documentation covering XennaExecutor, experimental executors, and BaseExecutor interface - technically accurate |
| .cursor/rules/modality-structure.mdc | 5/5 | Accurate documentation of four modalities (text, image, audio, video) with correct directory structure and task types |
| .cursor/rules/pipeline-structure.mdc | 5/5 | Pipeline creation, stage addition, running, and composite stage decomposition documented correctly |
| .cursor/rules/processing-stage-patterns.mdc | 5/5 | Thorough documentation of ProcessingStage pattern including type parameters, properties, inputs/outputs, and lifecycle methods |
| .cursor/rules/resources-configuration.mdc | 5/5 | Resources class documentation with accurate constraints about gpu_memory_gb vs gpus mutual exclusivity |
| .cursor/rules/task-patterns.mdc | 5/5 | Task base class and common task types documented with accurate attributes and EmptyTask usage patterns |

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant Cursor as Cursor IDE
    participant Rules as .cursor/rules/*.mdc
    participant Code as Codebase

    Dev->>Cursor: Request code assistance
    Cursor->>Rules: Load applicable rules
    Rules->>Cursor: coding-standards.mdc<br/>(Ruff rules, copyright, tests)
    Rules->>Cursor: processing-stage-patterns.mdc<br/>(Stage API, inputs/outputs)
    Rules->>Cursor: resources-configuration.mdc<br/>(GPU/CPU allocation)
    Rules->>Cursor: task-patterns.mdc<br/>(Task types, attributes)
    
    Cursor->>Code: Analyze existing implementation
    Code->>Cursor: Current patterns & structure
    
    Cursor->>Dev: Generate code suggestion<br/>(following documented patterns)
    Dev->>Code: Accept & implement
    
    Note over Dev,Code: For composite stages
    Dev->>Cursor: Request composite stage help
    Cursor->>Rules: composite-stage-patterns.mdc<br/>(decompose(), with_())
    Cursor->>Dev: Generate CompositeStage code
    
    Note over Dev,Code: For pipelines
    Dev->>Cursor: Request pipeline help
    Cursor->>Rules: pipeline-structure.mdc<br/>(add_stage, build, run)
    Cursor->>Rules: executors.mdc<br/>(XennaExecutor config)
    Cursor->>Dev: Generate Pipeline code
    
    Note over Dev,Code: For modality-specific work
    Dev->>Cursor: Request text/image/audio/video stage
    Cursor->>Rules: modality-structure.mdc<br/>(directory structure, task types)
    Cursor->>Dev: Generate modality-specific code

Contributor

@greptile-apps greptile-apps Bot left a comment

Additional Comments (2)

  1. .cursor/rules/processing-stage-patterns.mdc, line 122 (link)

    syntax: Variable name inconsistency: using batch when the parameter is task: Task

  2. .cursor/rules/processing-stage-patterns.mdc, line 123-126 (link)

    syntax: All references use batch but should be task to match the parameter name
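
To illustrate the naming issue flagged in these two comments, here is a minimal sketch of a method body that uses the parameter name `task` consistently. The `Task` dataclass and the filtering logic are hypothetical stand-ins; the flagged lines of the rule file are not shown in this conversation.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical stand-in for the task type named in the comment."""
    task_id: str
    data: list[str]


def process(task: Task) -> Task:
    # Every reference uses the parameter name `task`. Referring to `batch`
    # here would raise NameError, because no such name is defined.
    filtered = [doc for doc in task.data if doc]
    return Task(task_id=task.task_id, data=filtered)


out = process(Task(task_id="t0", data=["a", "", "b"]))
```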

8 files reviewed, 2 comments

Signed-off-by: Sarah Yurick <sarahyurick@gmail.com>
Contributor

@greptile-apps greptile-apps Bot left a comment

8 files reviewed, no comments

Contributor

@greptile-apps greptile-apps Bot left a comment

8 files reviewed, no comments

Comment thread .cursor/rules/resources-configuration.mdc Outdated
Signed-off-by: Sarah Yurick <53962159+sarahyurick@users.noreply.github.com>
Contributor

@greptile-apps greptile-apps Bot left a comment

Additional Comments (2)

  1. .cursor/rules/modality-structure.mdc, line 56 (link)

    style: Remove TODO comment before merging

  2. .cursor/rules/resources-configuration.mdc, line 30-34 (link)

    style: Missing resource attributes that exist in actual implementation: entire_gpu (bool), nvdecs (int), and nvencs (int). See nemo_curator/stages/resources.py:48-51
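
A rough sketch of what a resources container with these attributes could look like follows. It is an illustrative stand-in, not the class in nemo_curator/stages/resources.py: the `cpus` field, all default values, and the `ValueError` check are assumptions. Only the attribute names `gpu_memory_gb`, `gpus`, `entire_gpu`, `nvdecs`, and `nvencs`, and the constraint that `gpu_memory_gb` and `gpus` are mutually exclusive, come from this review.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    """Illustrative stand-in; defaults and validation are assumptions."""
    cpus: float = 1.0           # assumed field, not named in the review
    gpu_memory_gb: float = 0.0  # request GPU by memory size...
    gpus: float = 0.0           # ...or by GPU count, but not both
    entire_gpu: bool = False    # allocate a whole GPU regardless of memory
    nvdecs: int = 0             # number of NVDEC units required
    nvencs: int = 0             # number of NVENC units required

    def __post_init__(self) -> None:
        # Enforce the mutual exclusivity described in the review summary.
        if self.gpu_memory_gb > 0 and self.gpus > 0:
            raise ValueError("Specify either gpu_memory_gb or gpus, not both.")


ok = Resources(cpus=2.0, gpu_memory_gb=8.0, nvdecs=1)

try:
    Resources(gpu_memory_gb=8.0, gpus=1.0)
    conflict_rejected = False
except ValueError:
    conflict_rejected = True
```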

8 files reviewed, 2 comments

Comment thread .cursor/rules/modality-structure.mdc Outdated
Signed-off-by: Sarah Yurick <53962159+sarahyurick@users.noreply.github.com>
Contributor

@greptile-apps greptile-apps Bot left a comment

Additional Comments (1)

  1. .cursor/rules/resources-configuration.mdc, line 30-34 (link)

    style: The actual Resources class has additional attributes not documented here: nvdecs (int), nvencs (int), and entire_gpu (bool). Consider adding these to keep the documentation complete.

8 files reviewed, 1 comment

Contributor

@VibhuJawa VibhuJawa left a comment

LGTM, I will have to play with them more.

Minor nit: make it a dataclass.
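
As a hedged illustration of the dataclass nit (the suggestion itself lives in the comment thread and is not shown here), a composite stage written as a dataclass might look like the sketch below. All class names are hypothetical; only the idea of a `decompose()` method that expands a composite into its constituent stages is taken from this PR's summary.

```python
from dataclasses import dataclass


@dataclass
class NonEmptyStage:
    """Hypothetical leaf stage: keeps text that is not blank."""

    def process(self, text: str) -> bool:
        return bool(text.strip())


@dataclass
class WordCountStage:
    """Hypothetical leaf stage: keeps text with enough words."""
    min_words: int

    def process(self, text: str) -> bool:
        return len(text.split()) >= self.min_words


@dataclass
class QualityFilter:
    """Hypothetical composite: holds configuration as dataclass fields
    and expands into leaf stages on request."""
    min_words: int = 3

    def decompose(self) -> list:
        return [NonEmptyStage(), WordCountStage(min_words=self.min_words)]


stages = QualityFilter(min_words=2).decompose()
keep = all(stage.process("hello world") for stage in stages)
```

Declaring configuration as dataclass fields removes the hand-written `__init__` and gives the stage a readable `repr` for free, which is presumably the motivation for the nit.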

Comment thread .cursor/rules/composite-stage-patterns.mdc
Contributor

@greptile-apps greptile-apps Bot left a comment

Additional Comments (1)

  1. .cursor/rules/resources-configuration.mdc, line 30-34 (link)

    logic: Incomplete resource attributes. The actual Resources class also includes:

    • nvdecs: Number of NVDEC units required
    • nvencs: Number of NVENC units required
    • entire_gpu: Whether to allocate entire GPU regardless of memory

8 files reviewed, 1 comment

Contributor

@greptile-apps greptile-apps Bot left a comment

8 files reviewed, no comments

@sarahyurick sarahyurick merged commit a217590 into NVIDIA-NeMo:main Dec 16, 2025
12 checks passed
@sarahyurick sarahyurick deleted the cursor_rules branch February 9, 2026 18:13
copy-pr-bot Bot pushed a commit that referenced this pull request Feb 19, 2026
* add cursor rules for processing stages and tasks

Signed-off-by: Sarah Yurick <sarahyurick@gmail.com>

* add coding standards and modality structure

Signed-off-by: Sarah Yurick <sarahyurick@gmail.com>

* update file

Signed-off-by: Sarah Yurick <sarahyurick@gmail.com>

* add rules for composite stages, executors, pipelines, resources

Signed-off-by: Sarah Yurick <sarahyurick@gmail.com>

* fix inconsistency

Signed-off-by: Sarah Yurick <sarahyurick@gmail.com>

* Update .cursor/rules/resources-configuration.mdc

Signed-off-by: Sarah Yurick <53962159+sarahyurick@users.noreply.github.com>

* Update .cursor/rules/modality-structure.mdc

Signed-off-by: Sarah Yurick <53962159+sarahyurick@users.noreply.github.com>

* Add suggestion

---------

Signed-off-by: Sarah Yurick <sarahyurick@gmail.com>
Signed-off-by: Sarah Yurick <53962159+sarahyurick@users.noreply.github.com>
2 participants