[duplicate-code] 🔍 Duplicate Code Detected #1856

@github-actions

Analysis of commit 2d83ebf

Assignee: @copilot

Summary

Duplicate package collection and validation logic landed in pkg/workflow/validation.go. Three helpers repeat the same loops to gather package names, and two validation helpers duplicate the same pip-check workflow. The drift risk is high because these blocks must stay in sync when the workflow schema evolves.

Duplication Details

Pattern 1: Repeated package collection scaffolding

  • Severity: Medium
  • Occurrences: 3
  • Locations:
    • pkg/workflow/validation.go (lines 303-356)
    • pkg/workflow/validation.go (lines 381-414)
    • pkg/workflow/validation.go (lines 449-482)
  • Code Sample:
    func extractNpxPackages(workflowData *WorkflowData) []string {
        var packages []string
        seen := make(map[string]bool)
    
        if workflowData.CustomSteps != "" {
            for _, pkg := range extractNpxFromCommands(workflowData.CustomSteps) {
                if !seen[pkg] {
                    packages = append(packages, pkg)
                    seen[pkg] = true
                }
            }
        }
    
        if workflowData.EngineConfig != nil && len(workflowData.EngineConfig.Steps) > 0 {
            for _, step := range workflowData.EngineConfig.Steps {
                if run, hasRun := step["run"]; hasRun {
                    if runStr, ok := run.(string); ok {
                        for _, pkg := range extractNpxFromCommands(runStr) {
                            if !seen[pkg] {
                                packages = append(packages, pkg)
                                seen[pkg] = true
                            }
                        }
                    }
                }
            }
        }

        return packages
    }

    The same scaffolding reappears in extractPipPackages and extractUvPackages, differing only by the command-parsing helper. Any change to the workflow structure has to be replicated in all three blocks.
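A minimal sketch of how the three collectors could share one helper, assuming the extractor is the only thing that varies. The `WorkflowData` and `EngineConfig` types here are pared-down stand-ins that model only the fields visible in the sample, `extractAfter` is a hypothetical toy extractor standing in for `extractNpxFromCommands` and its siblings, and any further sources the real helpers read are not modeled.

```go
package main

import (
	"fmt"
	"strings"
)

// EngineConfig and WorkflowData are simplified stand-ins for the real types;
// only the fields used in the code sample above are modeled (assumption).
type EngineConfig struct {
	Steps []map[string]any
}

type WorkflowData struct {
	CustomSteps  string
	EngineConfig *EngineConfig
}

// collectPackages gathers package names from every command source in the
// workflow, de-duplicating while preserving first-seen order. The extractor
// is the only part that differs between the npx, pip, and uv collectors.
func collectPackages(wd *WorkflowData, extractor func(string) []string) []string {
	var packages []string
	seen := make(map[string]bool)

	// add runs the extractor over one command string and records new names.
	add := func(commands string) {
		for _, pkg := range extractor(commands) {
			if !seen[pkg] {
				seen[pkg] = true
				packages = append(packages, pkg)
			}
		}
	}

	if wd.CustomSteps != "" {
		add(wd.CustomSteps)
	}
	if wd.EngineConfig != nil {
		for _, step := range wd.EngineConfig.Steps {
			// A missing or non-string "run" value fails the assertion and is skipped.
			if runStr, ok := step["run"].(string); ok {
				add(runStr)
			}
		}
	}
	return packages
}

// extractAfter is a hypothetical toy extractor: it returns the token that
// follows each occurrence of cmd. The real parsers are more involved.
func extractAfter(cmd string) func(string) []string {
	return func(commands string) []string {
		var out []string
		fields := strings.Fields(commands)
		for i := 0; i+1 < len(fields); i++ {
			if fields[i] == cmd {
				out = append(out, fields[i+1])
			}
		}
		return out
	}
}

func main() {
	wd := &WorkflowData{
		CustomSteps: "npx foo\nnpx bar",
		EngineConfig: &EngineConfig{Steps: []map[string]any{
			{"run": "npx bar && npx baz"},
			{"uses": "actions/checkout@v4"}, // no "run" key, so it is ignored
		}},
	}
	fmt.Println(collectPackages(wd, extractAfter("npx"))) // prints [foo bar baz]
}
```

With a helper of this shape, each existing collector would reduce to a one-line call that passes its own parser, so a new command source would only need to be added in one place.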

Pattern 2: Pip-backed package validation flow duplicated

  • Severity: Medium
  • Occurrences: 2
  • Locations:
    • pkg/workflow/validation.go (lines 182-223)
    • pkg/workflow/validation.go (lines 273-299)
  • Code Sample:
    for _, pkg := range packages {
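        // NOTE: pkgName is not defined in this excerpt; presumably it is derived from pkg in the full source.
        cmd := exec.Command(pipCmd, "index", "versions", pkgName)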
        output, err := cmd.CombinedOutput()
    
        if err != nil {
            outputStr := strings.TrimSpace(string(output))
            fmt.Fprintln(os.Stderr, console.FormatWarningMessage(fmt.Sprintf("pip package '%s' validation failed - skipping verification. Package may or may not exist on PyPI.", pkg)))
            if c.verbose {
                fmt.Fprintln(os.Stderr, console.FormatWarningMessage(fmt.Sprintf("  Details: %s", outputStr)))
            }
        } else if c.verbose {
            fmt.Fprintln(os.Stderr, console.FormatInfoMessage(fmt.Sprintf("✓ pip package validated: %s", pkg)))
        }
    }
    The validatePipPackages and validateUvPackagesWithPip loops share the same structure, logging, and warning messaging with only naming differences.
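The shared loop could be sketched as follows, under some loudly stated assumptions: the check is injected as a function so the loop can run without pip installed, the `warn` and `info` callbacks stand in for the `console.FormatWarningMessage` and `console.FormatInfoMessage` wrappers, and the returned list of unverified packages is an addition for illustration. `checkFunc`, `pipIndexCheck`, and `stubCheck` are hypothetical names, not part of the existing code.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// checkFunc verifies one package and returns the tool's trimmed output.
type checkFunc func(pkg string) (string, error)

// pipIndexCheck runs "<pipCmd> index versions <pkg>", the same command the
// sample above uses.
func pipIndexCheck(pipCmd string) checkFunc {
	return func(pkg string) (string, error) {
		out, err := exec.Command(pipCmd, "index", "versions", pkg).CombinedOutput()
		return strings.TrimSpace(string(out)), err
	}
}

// stubCheck is a hypothetical stand-in that fails for one fixed name, so the
// example runs without pip or network access.
func stubCheck(pkg string) (string, error) {
	if pkg == "missing-pkg" {
		return "no matching distribution", errors.New("exit status 1")
	}
	return "", nil
}

// validatePythonPackages runs check for every package and reports through the
// warn and info callbacks. label ("pip" or "uv") is the only wording that
// differs between callers. It returns the packages that could not be verified.
func validatePythonPackages(packages []string, label string, verbose bool,
	check checkFunc, warn, info func(string)) []string {
	var unverified []string
	for _, pkg := range packages {
		output, err := check(pkg)
		if err != nil {
			unverified = append(unverified, pkg)
			warn(fmt.Sprintf("%s package '%s' validation failed - skipping verification. Package may or may not exist on PyPI.", label, pkg))
			if verbose {
				warn(fmt.Sprintf("  Details: %s", output))
			}
		} else if verbose {
			info(fmt.Sprintf("✓ %s package validated: %s", label, pkg))
		}
	}
	return unverified
}

func main() {
	toStderr := func(msg string) { fmt.Fprintln(os.Stderr, msg) }
	unverified := validatePythonPackages(
		[]string{"requests", "missing-pkg"}, "pip", true, stubCheck, toStderr, toStderr)
	fmt.Println(unverified) // prints [missing-pkg]

	_ = pipIndexCheck // pass pipIndexCheck("pip") instead of stubCheck to query PyPI
}
```

Injecting the check keeps the loop testable and leaves the choice of pip executable to the caller. A version closer to the existing methods could instead hang off the same receiver and read `c.verbose` directly.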

Impact Analysis

  • Maintainability: The three collection helpers and two validation helpers must be kept in sync by hand, which invites drift the next time workflow data gains new sources or the logging changes.
  • Bug Risk: Fixes (e.g., additional command sources or logging tweaks) must be applied in multiple places, increasing the chance of inconsistent behavior.
  • Code Bloat: Extra copies add ~90 lines of duplicated logic inside a single file.

Refactoring Recommendations

  1. Factor shared collectors

    • Extract a generic helper such as collectPackages(workflowData, extractor, includeTools bool) and pass in the command parser (e.g., extractNpxFromCommands).
    • Estimated effort: 2-3 hours to implement and update call sites.
    • Benefits: Single point to update when workflow inputs or dedup logic evolve.
  2. Unify pip validation loop

    • Introduce a reusable validatePythonPackages(packages []string, cmdName string) that handles the common loop and logging, with callers supplying the command label.
    • Estimated effort: 1-2 hours.
    • Benefits: Centralized messaging and reduced chance of pip/uv divergence.

Implementation Checklist

  • Review duplication findings
  • Prioritize refactoring tasks
  • Create refactoring plan
  • Implement changes
  • Update tests
  • Verify no functionality broken

Analysis Metadata

  • Analyzed Files: 1
  • Detection Method: Serena semantic code analysis
  • Commit: 2d83ebf
  • Analysis Date: 2025-10-17T11:10:01Z

AI generated by Duplicate Code Detector
