
[refactor] Semantic Function Clustering Analysis: Outliers, Duplicates & Refactoring Opportunities #20709

@github-actions

Automated semantic analysis of 562 non-test Go files across 18 packages, concentrated in pkg/workflow/ (269 files, 1,377 functions) and pkg/cli/ (203 files, 827 functions). Five high-impact refactoring opportunities were identified.

Analysis Summary

| Metric | Count |
|---|---|
| Total Go files analyzed | 562 |
| Total functions cataloged | ~2,274 |
| Packages analyzed | 18 |
| Outlier functions (wrong file) | 26 |
| Near-duplicate functions detected | 6 clusters |
| Scattered utility patterns | 3 groups |

Critical Issues

1. Validation Functions Misplaced in Non-Validation Files

pkg/workflow/ — 11 validation functions live outside *_validation.go files:

| File | Function | Recommendation |
|---|---|---|
| action_sha_checker.go | ValidateActionSHAsInLockFile | Move to new action_sha_validation.go |
| cache.go | validateNoDuplicateCacheIDs | Move to new cache_validation.go |
| github_tool_to_toolset.go | ValidateGitHubToolsAgainstToolsets | Move to tools_validation.go |
| imports.go | ValidateIncludedPermissions | Move to permissions_validation.go |
| jobs.go | ValidateDuplicateSteps, ValidateDependencies | Move to new jobs_validation.go |
| lock_schema.go | ValidateLockSchemaCompatibility | Move to new lock_validation.go |
| repo_memory.go | validateNoDuplicateMemoryIDs, validateBranchPrefix | Move to new repo_memory_validation.go |
| templatables.go | validateStringEnumField | Move to validation_helpers.go |

pkg/cli/ — 4 validation functions embedded in command/helper files:

| File | Function | Recommendation |
|---|---|---|
| actions_build_command.go | validateActionYml, ActionsValidateCommand | Extract to validation file |
| compile_compiler_setup.go | validateActionModeConfig | Move to compile_validation.go |
| mcp_server_helpers.go | validateWorkflowName | Clarify vs validators.go:ValidateWorkflowName |

2. Near-Duplicate Functions

validateNoDuplicateCacheIDs / validateNoDuplicateMemoryIDs

  • pkg/workflow/cache.go:862
  • pkg/workflow/repo_memory.go:510
  • Both implement the identical seen map[string]bool duplicate-ID check pattern. A Go generics helper in validation_helpers.go would unify both:
    func validateNoDuplicateIDs[T interface{ GetID() string }](items []T) error { ... }
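
A minimal sketch of that helper follows, assuming both item types can expose a GetID() string method as the suggested signature implies. The cacheEntry and memoryEntry types are hypothetical stand-ins for the real config structs, and the error wording is illustrative only.

```go
package main

import "fmt"

// identified is the constraint from the suggested signature:
// any type that can report its ID as a string.
type identified interface{ GetID() string }

// validateNoDuplicateIDs returns an error naming the first ID seen twice,
// using the same seen-map pattern both existing validators implement.
func validateNoDuplicateIDs[T identified](items []T) error {
	seen := make(map[string]bool, len(items))
	for _, item := range items {
		id := item.GetID()
		if seen[id] {
			return fmt.Errorf("duplicate ID %q", id)
		}
		seen[id] = true
	}
	return nil
}

// cacheEntry and memoryEntry are hypothetical stand-ins for the real types.
type cacheEntry struct{ ID string }

func (c cacheEntry) GetID() string { return c.ID }

type memoryEntry struct{ ID string }

func (m memoryEntry) GetID() string { return m.ID }

func main() {
	caches := []cacheEntry{{ID: "a"}, {ID: "b"}, {ID: "a"}}
	memories := []memoryEntry{{ID: "x"}, {ID: "y"}}

	fmt.Println(validateNoDuplicateIDs(caches))   // reports the repeated "a"
	fmt.Println(validateNoDuplicateIDs(memories)) // no duplicates, nil error
}
```

If the two call sites need different error messages, one option is to wrap the returned error at each call site rather than adding a parameter to the helper.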

parsePRURL (cli wrapper) / ParsePRURL (parser)

  • pkg/cli/pr_command.go:107 — one-liner delegating to parser.ParsePRURL
  • pkg/parser/github_urls.go:331 — canonical implementation
  • The CLI wrapper adds no value; callers can use parser.ParsePRURL directly.

parseGoMod / parseGoModWithIndirect / parseGoModFile (1 canonical function, 2 wrappers)

  • pkg/cli/deps_helpers.go:34 - parseGoModFile (canonical)
  • pkg/cli/deps_report.go:275 - parseGoModWithIndirect (marked "backward compatibility")
  • pkg/cli/deps_outdated.go:139 - parseGoMod (filters indirect, wraps canonical)
  • The two wrappers should be inlined at their call sites and removed.

validateWorkflowName (two versions)

  • pkg/cli/validators.go:18 - ValidateWorkflowName (exported, strict)
  • pkg/cli/mcp_server_helpers.go:132 - validateWorkflowName (allows empty)
  • The names differ only in capitalization, yet the semantics differ. Either document the distinction or give the MCP version an explicit name (e.g., validateWorkflowNameOrEmpty).
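
One way to make the split explicit is sketched below, assuming the only intended difference is that the MCP variant accepts an empty name. The body of ValidateWorkflowName here is a placeholder, since the real validation rules are not shown in this issue.

```go
package main

import (
	"errors"
	"fmt"
)

// ValidateWorkflowName stands in for the strict validator in validators.go.
// Placeholder rule: only the empty-name check is modeled here.
func ValidateWorkflowName(name string) error {
	if name == "" {
		return errors.New("workflow name cannot be empty")
	}
	return nil
}

// validateWorkflowNameOrEmpty is the proposed explicit name for the MCP
// variant: an empty name is accepted, anything else goes through the
// strict validator so the two can never drift apart.
func validateWorkflowNameOrEmpty(name string) error {
	if name == "" {
		return nil
	}
	return ValidateWorkflowName(name)
}

func main() {
	fmt.Println(ValidateWorkflowName(""))        // strict: rejects empty
	fmt.Println(validateWorkflowNameOrEmpty("")) // lenient: accepts empty
}
```

Delegating to the strict validator keeps a single source of truth for non-empty names.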

validateExpressionSyntax / validateSingleExpression

  • pkg/workflow/concurrency_validation.go:131 - validateExpressionSyntax
  • pkg/workflow/expression_validation.go:241 - validateSingleExpression
  • Both validate ${{ }} expression syntax; both belong in the same file.

validateMountsMCPSyntax / validateMountsSyntax

  • pkg/workflow/mcp_config_validation.go:273
  • pkg/workflow/sandbox_validation.go:23
  • Both delegate to validateMountStringFormat; differ only in error context.

3. Scattered Helper Patterns

43 (c *Compiler) parseXxxConfig(outputMap map[string]any) methods across 40+ action files

Every safe-output action file (add_labels.go, create_issue.go, hide_comment.go, etc.) contains exactly one compiler method with an identical func (c *Compiler) parseXxxConfig(outputMap map[string]any) *XxxConfig signature. This is a mass scatter pattern — prime candidate for a registry or code-generated table-driven dispatcher keyed by YAML action name.
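
A table-driven dispatcher could look roughly like the sketch below. Everything in it is an assumption made for illustration: the Compiler stand-in, the two config structs and their fields, the registry keys, and the choice to return any so one map can hold parsers for different config types.

```go
package main

import (
	"fmt"
	"sort"
)

// Compiler is a stand-in for the real compiler type.
type Compiler struct{}

// Illustrative config types; the real structs are not shown in this issue.
type AddLabelsConfig struct{ Max int }
type CreateIssueConfig struct{ TitlePrefix string }

// configParser receives the raw mapping stored under one action key.
// It returns `any` so a single table can hold every config type.
type configParser func(c *Compiler, raw map[string]any) any

// parserRegistry maps a YAML action name to its parser (keys are assumed).
var parserRegistry = map[string]configParser{
	"add-labels": func(_ *Compiler, raw map[string]any) any {
		cfg := &AddLabelsConfig{}
		if n, ok := raw["max"].(int); ok {
			cfg.Max = n
		}
		return cfg
	},
	"create-issue": func(_ *Compiler, raw map[string]any) any {
		cfg := &CreateIssueConfig{}
		if p, ok := raw["title-prefix"].(string); ok {
			cfg.TitlePrefix = p
		}
		return cfg
	},
}

// parseOutputs runs the registered parser for every known key in outputMap.
// Keys without a parser, or whose value is not a mapping, are skipped.
func (c *Compiler) parseOutputs(outputMap map[string]any) map[string]any {
	parsed := make(map[string]any)
	for name, parse := range parserRegistry {
		raw, ok := outputMap[name].(map[string]any)
		if !ok {
			continue
		}
		parsed[name] = parse(c, raw)
	}
	return parsed
}

func main() {
	c := &Compiler{}
	parsed := c.parseOutputs(map[string]any{
		"add-labels":   map[string]any{"max": 3},
		"create-issue": map[string]any{"title-prefix": "[bot] "},
		"unregistered": map[string]any{},
	})

	names := make([]string, 0, len(parsed))
	for name := range parsed {
		names = append(names, name)
	}
	sort.Strings(names) // map iteration order is random; sort for stable output
	for _, name := range names {
		fmt.Printf("%s: %+v\n", name, parsed[name])
	}
}
```

The trade-off is that callers get any back and must type-assert. A code-generated table could keep typed accessors while still removing the per-file boilerplate.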

GitHub URL / repo-slug parsing split across 5 files in 3 packages:

  • pkg/stringutil/urls.go - NormalizeGitHubHostURL, ExtractDomainFromURL
  • pkg/parser/github_urls.go - ParseGitHubURL, ParsePRURL, ParseRunURLExtended, ParseRepoFileURL
  • pkg/cli/git.go - parseGitHubRepoSlugFromURL
  • pkg/cli/spec.go - parseGitHubURL, parseWorkflowSpec, parseRepoSpec

pkg/parser/github_urls.go should be the single authority; CLI-local wrappers should delegate to it or be eliminated.
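
The delegation pattern can be sketched as follows. Both function names and their signatures are hypothetical; the real pkg/parser API is not shown in this issue. The point is only that the CLI-side helper does no URL parsing of its own.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseRepoURL plays the role of the single canonical parser (hypothetical
// name and signature). It handles https-style URLs only; SSH remotes such
// as git@host:owner/repo are out of scope for this sketch.
func parseRepoURL(rawURL string) (owner, repo string, err error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", "", fmt.Errorf("parsing %q: %w", rawURL, err)
	}
	parts := strings.Split(strings.Trim(u.Path, "/"), "/")
	if u.Host == "" || len(parts) < 2 || parts[0] == "" || parts[1] == "" {
		return "", "", fmt.Errorf("%q is not a repository URL", rawURL)
	}
	return parts[0], strings.TrimSuffix(parts[1], ".git"), nil
}

// repoSlugFromURL is what a CLI-local helper reduces to once it delegates:
// it only formats the canonical parser's result as "owner/repo".
func repoSlugFromURL(rawURL string) (string, error) {
	owner, repo, err := parseRepoURL(rawURL)
	if err != nil {
		return "", err
	}
	return owner + "/" + repo, nil
}

func main() {
	slug, err := repoSlugFromURL("https://github.com/octo/repo.git")
	fmt.Println(slug, err)
}
```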

Expression validation primitives misplaced in concurrency_validation.go:

  • validateBalancedBraces (line 78)
  • validateBalancedQuotes (line 214)
  • validateExpressionSyntax (line 131)
  • validateExpressionContent (line 162)

These are expression-level primitives, not concurrency-specific. They belong in expression_validation.go.


Refactoring Recommendations

Priority 1 — High Impact

  1. Consolidate 43 scattered parseXxxConfig methods into a registry/table-driven dispatch in pkg/workflow/

    • Each action file currently owns its own parser method with identical signatures
    • A keyed-dispatch approach replaces the implicit one-parser-per-file convention with an explicit table
    • Makes adding new actions trivial and removes boilerplate
  2. Move expression validation primitives to expression_validation.go

    • Move validateExpressionSyntax, validateExpressionContent, validateBalancedBraces, validateBalancedQuotes from concurrency_validation.go
    • Places validateExpressionSyntax next to its near-duplicate validateSingleExpression, so the overlap can be merged

Priority 2 — Medium Impact

  1. Centralize GitHub URL parsing under pkg/parser

    • Eliminate pkg/cli/git.go:parseGitHubRepoSlugFromURL and pkg/cli/spec.go:parseGitHubURL (verify they can delegate to pkg/parser)
    • Removes risk of divergent parsing behavior across CLI entry points
  2. Move misplaced validation functions to their respective _validation.go files

    • 11 functions in pkg/workflow/, 4 in pkg/cli/
    • Creates missing cache_validation.go, jobs_validation.go, lock_validation.go, repo_memory_validation.go

Priority 3 — Low Effort / Quick Wins

  1. Generify duplicate-ID validators using Go generics

    • Single validateNoDuplicateIDs[T] in validation_helpers.go replaces validateNoDuplicateCacheIDs and validateNoDuplicateMemoryIDs
  2. Remove one-liner wrapper functions

    • Delete pkg/cli/pr_command.go:parsePRURL (wraps parser.ParsePRURL)
    • Inline or remove parseGoModWithIndirect (marked "backward compatibility")

Implementation Checklist

  • Move expression validation primitives from concurrency_validation.go → expression_validation.go
  • Create cache_validation.go, jobs_validation.go, lock_validation.go, repo_memory_validation.go and move functions
  • Move ValidateGitHubToolsAgainstToolsets → tools_validation.go
  • Move ValidateIncludedPermissions → permissions_validation.go
  • Remove pkg/cli/pr_command.go:parsePRURL wrapper
  • Remove parseGoModWithIndirect backward-compat wrapper
  • Generify duplicate-ID validators with Go generics
  • Document or rename validateWorkflowName vs ValidateWorkflowName distinction
  • Evaluate registry pattern for 43 parseXxxConfig methods
  • Consolidate GitHub URL parsing under pkg/parser

References: §23014619171

Generated by Semantic Function Refactoring

  • expires on Mar 14, 2026, 5:23 PM UTC
