Skip to content

[refactor] Semantic Function Clustering Analysis: Duplicate and Outlier Functions #28710

@github-actions

Description

@github-actions

Analysis of the github/gh-aw repository — 727 non-test Go source files across 22 packages in pkg/. Findings below are ordered by estimated impact.

Executive Summary

  • Total files analyzed: 727 non-test Go files
  • Packages analyzed: 22 (pkg/workflow 352 files, pkg/cli 264 files, 20 utility packages)
  • Exact duplicates found: 1 pair (identical function bodies)
  • Redundant overrides found: 2 (base class already provides identical implementation)
  • Refactoring opportunities identified: 3

Issue 1: Exact Duplicate Functions — extractOpenCodeProviderFromModel / extractCrushProviderFromModel

File: pkg/workflow/domains.go
Severity: High — identical implementations, clear consolidation opportunity

Both functions at lines 192–201 and 226–235 have byte-for-byte identical bodies:

// Line 192 — for OpenCode
func extractOpenCodeProviderFromModel(model string) string {
    if model == "" {
        return "copilot"
    }
    parts := strings.SplitN(model, "/", 2)
    if len(parts) < 2 {
        return "copilot"
    }
    return strings.ToLower(parts[0])
}

// Line 226 — for Crush (identical)
func extractCrushProviderFromModel(model string) string {
    if model == "" {
        return "copilot"
    }
    parts := strings.SplitN(model, "/", 2)
    if len(parts) < 2 {
        return "copilot"
    }
    return strings.ToLower(parts[0])
}

Both parse the "provider/model" format and return the lowercase provider prefix with "copilot" as the default. The same parsing logic also appears in resolveUniversalLLMBackendFromModel in pkg/workflow/universal_llm_consumer_engine.go (though that function has richer error handling and a backend lookup).

Recommendation: Consolidate into a single unexported helper:

// extractProviderFromModel parses "provider/model" format and returns the
// lowercase provider prefix; returns "copilot" when format is absent.
func extractProviderFromModel(model string) string {
    if model == "" {
        return "copilot"
    }
    parts := strings.SplitN(model, "/", 2)
    if len(parts) < 2 {
        return "copilot"
    }
    return strings.ToLower(parts[0])
}

Then replace both call sites in GetOpenCodeDefaultDomains and GetCrushDefaultDomains.

Estimated effort: ~15 minutes
Benefit: Single source of truth for provider extraction logic; reduces risk of the two diverging silently


Issue 2: Redundant Method Overrides — GetFirewallLogsCollectionStep on Claude and Codex Engines

Files: pkg/workflow/claude_engine.go:468–471, pkg/workflow/codex_engine.go:388–390
Severity: Medium — code noise, misleading comments

BaseEngine.GetFirewallLogsCollectionStep in pkg/workflow/agentic_engine.go:382–384 already returns []GitHubActionStep{}:

func (e *BaseEngine) GetFirewallLogsCollectionStep(workflowData *WorkflowData) []GitHubActionStep {
    return []GitHubActionStep{}
}

Both ClaudeEngine and CodexEngine override this with the exact same body, adding no new behavior:

// claude_engine.go — identical to BaseEngine
func (e *ClaudeEngine) GetFirewallLogsCollectionStep(workflowData *WorkflowData) []GitHubActionStep {
    return []GitHubActionStep{}
}

// codex_engine.go — identical to BaseEngine
func (e *CodexEngine) GetFirewallLogsCollectionStep(workflowData *WorkflowData) []GitHubActionStep {
    return []GitHubActionStep{}
}

The comments on these overrides describe why the step is empty ("firewall logs are now at a known location"), which is useful context — but that context belongs in the base implementation's comment, not in each engine that happens to share the same behavior.

Recommendation: Delete both override methods. Move the rationale comment to BaseEngine.GetFirewallLogsCollectionStep. Engines that genuinely override this (e.g., CopilotEngine) are unaffected.

Estimated effort: ~10 minutes
Benefit: Removes misleading suggestion that these engines have special firewall collection behavior; reduces ~20 lines of dead code


Issue 3: Thin Wrapper Duplication — Config Parsing Helpers

Files: pkg/workflow/config_helpers.go and pkg/workflow/safe_outputs_parser.go
Severity: Low — acceptable but worth noting

Both files contain thin wrapper functions over the same underlying calls with only the config key name differing:

config_helpers.go safe_outputs_parser.go
parseLabelsFromConfig → key "labels" parseRequiredLabelsFromConfig → key "required-labels"
parseTitlePrefixFromConfig → key "title-prefix" parseRequiredTitlePrefixFromConfig → key "required-title-prefix"

These serve distinct configuration namespaces (safe-output filtering vs. general config), so they are not exact duplicates. However, each function is a one-liner wrapper whose name and key differ in parallel. If the set of such wrappers grows further, introducing a typed config key constant pattern would prevent silent key typos.

Recommendation: No immediate change required. If additional parse*FromConfig wrappers are added, consider extracting config key constants to a shared file to make the key names discoverable.


Function Cluster Summary

Well-organized clusters (no action needed)

Engine Implementations (Correct Polymorphism)

Each engine (claude, codex, gemini, copilot, crush, opencode) has:

  • <engine>_engine.go — core interface implementation
  • <engine>_mcp.go — MCP config rendering
  • <engine>_logs.go — log metric parsing

This is correct interface-based design. The shared BaseEngine default implementations and the defaultGetSquidLogsSteps shared helper demonstrate that common logic is already extracted.

Entity Operation Files (Good Decomposition)

  • update_issue_helpers.go, update_discussion_helpers.go, update_pull_request_helpers.go — each wraps a single parseUpdate*Config method using shared UpdateEntityConfig and UpdateEntityJobParams types
  • close_entity_helpers.go — correctly consolidates all close-entity logic in one file

Utility Package Separation (Correct)

  • pkg/stringutil/ — general-purpose string utilities (Truncate, NormalizeWhitespace, etc.)
  • pkg/workflow/strings.go — workflow-domain string utilities (SanitizeWorkflowName, GenerateHeredocDelimiterFromSeed, etc.)

These are appropriately separated by domain, not duplicated.


Implementation Checklist

  • Issue 1 — Consolidate extractOpenCodeProviderFromModel and extractCrushProviderFromModel into extractProviderFromModel in pkg/workflow/domains.go
  • Issue 2 — Delete ClaudeEngine.GetFirewallLogsCollectionStep (pkg/workflow/claude_engine.go:468–471) and CodexEngine.GetFirewallLogsCollectionStep (pkg/workflow/codex_engine.go:388–390); move rationale to BaseEngine
  • Issue 3 — Track as low-priority; consider config key constants if more wrappers are added

Analysis Metadata

Property Value
Total Go files analyzed 727
Primary packages pkg/workflow (352), pkg/cli (264)
Analysis method Serena LSP semantic analysis + naming pattern analysis
Exact duplicates 1 pair
Redundant overrides 2 methods
Analysis date 2026-04-27
Workflow run §24993297384

Generated by Semantic Function Refactoring · ● 674.4K ·

  • expires on Apr 29, 2026, 12:02 PM UTC

Metadata

Metadata

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions