Summary
Group a flat list of DiscoveredItems into DraftFeature clusters. This is the core intelligence of Phase 2. Uses file path proximity, naming conventions, and API path prefixes — no ML or LLM required.
Depends on: #124, and at least one miner (#127–#132)
New file
src/specleft/discovery/grouping.py
from specleft.discovery.models import (
DiscoveredItem, DraftFeature, DraftScenario, ItemKind,
TestFunctionMeta, ApiRouteMeta, GitCommitMeta,
)
def group_items(items: list[DiscoveredItem]) -> list[DraftFeature]: ...
Grouping strategy (applied in priority order)
1. File-path grouping (primary)
Items whose file_path shares a common directory segment form an initial group. Group key = the most specific shared directory name.
tests/auth/test_login.py
tests/auth/test_logout.py → group key: "auth"
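As an illustration of the file-path rule, here is a minimal sketch that operates on plain path strings rather than real DiscoveredItem objects. It assumes the "most specific shared directory" can be approximated by each file's immediate parent directory; the function name group_by_directory and the "root" bucket are hypothetical, not part of the spec.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_directory(file_paths: list[str]) -> dict[str, list[str]]:
    """Bucket paths by the name of their immediate parent directory."""
    groups: dict[str, list[str]] = defaultdict(list)
    for raw in file_paths:
        parent = PurePosixPath(raw).parent
        # Assumption: files with no parent directory go into a
        # hypothetical "root" bucket instead of being dropped.
        key = parent.name or "root"
        groups[key].append(raw)
    return dict(groups)
```

With this approximation, tests/auth/test_login.py and tests/auth/test_logout.py both map to the key "auth". Nested layouts such as tests/auth/oauth/ would need an explicit rule for how deep the key should go.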
2. API path prefix grouping
For route items (kind=API_ROUTE), call item.typed_meta() to get an ApiRouteMeta and read its path, then group by the first path segment.
GET /users/{id}
POST /users → group key: "users"
DELETE /users/{id}
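The prefix rule can be sketched on bare route paths (the value ApiRouteMeta.path is assumed to hold, without the HTTP method). The helper names first_path_segment and group_routes are hypothetical.

```python
from collections import defaultdict


def first_path_segment(route_path: str) -> str:
    """Return the first non-empty segment of a route path, or "" if none."""
    segments = [s for s in route_path.split("/") if s]
    return segments[0] if segments else ""


def group_routes(route_paths: list[str]) -> dict[str, list[str]]:
    """Bucket route paths by their first segment."""
    groups: dict[str, list[str]] = defaultdict(list)
    for path in route_paths:
        groups[first_path_segment(path)].append(path)
    return dict(groups)
```

Note that a versioned API such as /api/v1/users would group under "api" with this sketch. Whether to skip such prefixes is a design decision the issue leaves open.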
3. Name-prefix grouping (fallback)
Items whose name shares a common prefix after stripping test-style prefixes (test_, "test ", "it ", etc.) form a group.
test_payment_success
test_payment_declined → group key: "payment"
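A rough sketch of the fallback, assuming the shared prefix can be approximated by the first word that remains once the test prefix is stripped. The regular expression and the helper names name_key and group_by_name_prefix are illustrative assumptions.

```python
import re
from collections import defaultdict

# Assumption: only these three prefixes are stripped; the "etc." in the
# spec suggests the real list is longer.
_TEST_PREFIX = re.compile(r"^(?:test_|test |it )", re.IGNORECASE)


def name_key(name: str) -> str:
    """Strip a leading test prefix and return the first remaining word."""
    stripped = _TEST_PREFIX.sub("", name.strip())
    tokens = [t for t in re.split(r"[_\s]+", stripped) if t]
    return tokens[0].lower() if tokens else ""


def group_by_name_prefix(names: list[str]) -> dict[str, list[str]]:
    """Bucket item names by their stripped first word."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name in names:
        groups[name_key(name)].append(name)
    return dict(groups)
```

Keying on the first word is cruder than a true longest-common-prefix match, so multi-word features (for example payment_refund versus payment_capture) would collapse into one group here.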
4. Git history cross-reference
For git items (kind=GIT_COMMIT), call item.typed_meta() to get a GitCommitMeta and read its file_prefixes. Merge each item into the existing group whose file paths overlap most with those prefixes. Unmatched git items form their own group only if at least 3 commits point to the same prefix.
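One possible reading of the merge rule, sketched on plain dictionaries. It assumes "overlap" means the number of group file paths that start with any of the commit's prefixes, and that an unmatched commit is bucketed under its first prefix so it lands in at most one group. The names overlap_count and assign_commits are hypothetical.

```python
def overlap_count(file_prefixes: list[str], paths: list[str]) -> int:
    """Count group paths that start with any of the commit's prefixes."""
    return sum(1 for p in paths if any(p.startswith(pre) for pre in file_prefixes))


def assign_commits(
    commits: dict[str, list[str]],   # commit id -> file_prefixes
    groups: dict[str, list[str]],    # group key -> file paths
    min_commits: int = 3,
) -> tuple[dict[str, list[str]], dict[str, list[str]]]:
    """Return (commits merged per existing group, new git-only groups)."""
    merged: dict[str, list[str]] = {}
    orphans: dict[str, list[str]] = {}
    for commit_id, prefixes in commits.items():
        scores = {key: overlap_count(prefixes, paths) for key, paths in groups.items()}
        best = max(scores, key=scores.get, default=None)
        if best is not None and scores[best] > 0:
            merged.setdefault(best, []).append(commit_id)
        elif prefixes:
            # Assumption: bucket an unmatched commit under its first prefix.
            orphans.setdefault(prefixes[0], []).append(commit_id)
    # Only prefixes with enough unmatched commits become their own group.
    new_groups = {p: ids for p, ids in orphans.items() if len(ids) >= min_commits}
    return merged, new_groups
```

Ties between equally overlapping groups resolve to whichever group was inserted first in this sketch; a real implementation may want a deterministic tie-break.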
Typed metadata access
The grouping algorithm should use item.typed_meta() for type-safe access to metadata fields. This avoids raw dict["key"] lookups, which can raise KeyError at runtime, and lets a static type checker verify field access:
# Instead of:
path = item.metadata["path"] # KeyError risk
prefixes = item.metadata["file_prefixes"] # KeyError risk
# Use:
meta = item.typed_meta()
if isinstance(meta, ApiRouteMeta):
path = meta.path # type-safe
elif isinstance(meta, GitCommitMeta):
prefixes = meta.file_prefixes # type-safe
Group naming
Slugify the group key. Expand common abbreviations before slugifying:
auth → authentication
mgmt → management
cfg / config → configuration
notif → notifications
msg → messaging
DraftFeature.name = title-cased expanded label (e.g. "User Authentication").
DraftFeature.feature_id = slugified (e.g. "user-authentication").
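The naming rules could look roughly like the following. The abbreviation table is taken from the list above; the helper names expand_key, feature_name, and feature_id, and the choice to split keys on any non-alphanumeric character, are assumptions.

```python
import re

ABBREVIATIONS = {
    "auth": "authentication",
    "mgmt": "management",
    "cfg": "configuration",
    "config": "configuration",
    "notif": "notifications",
    "msg": "messaging",
}


def expand_key(group_key: str) -> list[str]:
    """Split a group key into lowercase words and expand abbreviations."""
    words = [w for w in re.split(r"[^a-z0-9]+", group_key.lower()) if w]
    return [ABBREVIATIONS.get(w, w) for w in words]


def feature_name(group_key: str) -> str:
    """Title-cased expanded label, e.g. for DraftFeature.name."""
    return " ".join(w.capitalize() for w in expand_key(group_key))


def feature_id(group_key: str) -> str:
    """Hyphen-joined slug, e.g. for DraftFeature.feature_id."""
    return "-".join(expand_key(group_key))
```

For a key such as "user_auth", this yields the name "User Authentication" and the id "user-authentication", matching the examples above.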
Confidence scoring
| Signal | Bonus |
| --- | --- |
| Items from >=2 different miners | +0.2 |
| Group has >=1 docstring item | +0.1 |
| Group has >=1 git item corroborating | +0.1 |
| Base score | 0.5 |
| Maximum | 1.0 |
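The scoring table can be expressed as a small pure function. The signature below is hypothetical: it assumes the caller has already worked out which miners contributed and whether docstring or git items are present in the group.

```python
def confidence(miner_names: set[str], has_docstring_item: bool, has_git_item: bool) -> float:
    """Apply the base score and bonuses from the table, capped at 1.0."""
    score = 0.5  # base score
    if len(miner_names) >= 2:
        score += 0.2
    if has_docstring_item:
        score += 0.1
    if has_git_item:
        score += 0.1
    # Round to avoid float artifacts such as 0.8999999999999999.
    return min(round(score, 2), 1.0)
```

With the bonuses as listed, the highest reachable score is 0.9, so the 1.0 cap only matters if further signals are added later.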
Acceptance criteria
Items under tests/auth/ all land in a single group
GET /payments/* and POST /payments form a separate group from users
Each DiscoveredItem appears in exactly one DraftFeature
Git items are merged via GitCommitMeta.file_prefixes, not siloed
Grouping uses item.typed_meta() for type-safe metadata access (no raw metadata["key"])
Tests in tests/discovery/test_grouping.py using synthetic DiscoveredItem fixtures
Update features/feature-spec-discovery.md to cover the functionality introduced by this issue
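The exactly-one criterion lends itself to a small reusable check in the test suite. The sketch below works on item identifiers rather than real DiscoveredItem or DraftFeature objects, since the field that links a feature to its items is not specified here; find_partition_violations is a hypothetical helper name.

```python
from collections import Counter


def find_partition_violations(
    item_ids: list[str], groups: dict[str, list[str]]
) -> dict[str, int]:
    """Map each item id that does not appear exactly once to its appearance count."""
    counts = Counter(i for members in groups.values() for i in members)
    return {i: counts[i] for i in item_ids if counts[i] != 1}
```

An empty result means every item was assigned to exactly one group. A count of 0 flags a dropped item and a count above 1 flags a duplicate.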