Summary
The orchestrator wires together all miners and produces a DiscoveryReport. It is the single entry point for both specleft discover and specleft start. Miners are registered by ID; the orchestrator runs them sequentially, catches per-miner failures, and merges results.
This issue also introduces the MinerContext, FrameworkDetector, and DiscoveryConfig — shared infrastructure that all miners consume.
Depends on: #124, #125
New files
src/specleft/discovery/config.py
from dataclasses import dataclass
from pathlib import Path

from specleft.discovery.file_index import DEFAULT_EXCLUDE_DIRS


@dataclass(frozen=True)
class DiscoveryConfig:
    """
    User-facing discovery settings. Loaded from pyproject.toml
    [tool.specleft.discovery] section, with sensible defaults.
    """

    exclude_dirs: frozenset[str] = DEFAULT_EXCLUDE_DIRS
    source_dirs: tuple[str, ...] = ("src", "lib", "app", "core")
    max_git_commits: int = 200

    @classmethod
    def from_pyproject(cls, root: Path) -> "DiscoveryConfig":
        """Parse [tool.specleft.discovery] from pyproject.toml at root.
        Falls back to defaults if section is missing."""

    @classmethod
    def default(cls) -> "DiscoveryConfig":
        """Return config with all defaults."""
Supported pyproject.toml section:
[tool.specleft.discovery]
exclude_dirs = [".git", "node_modules", "__pycache__", ".venv", "dist", "build"]
source_dirs = ["src", "lib", "app"]
max_git_commits = 200
src/specleft/discovery/framework_detector.py
from pathlib import Path

from specleft.discovery.models import SupportedLanguage
from specleft.discovery.file_index import FileIndex


class FrameworkDetector:
    """
    Centralised framework detection. Hybrid approach:
    1. Parse manifest files (pyproject.toml, package.json)
    2. Confirm via file patterns if ambiguous
    3. File patterns win on conflict

    Used by the pipeline (passed to miners via MinerContext)
    and by the `specleft start` command directly.
    """

    def detect(
        self,
        root: Path,
        file_index: FileIndex,
    ) -> dict[SupportedLanguage, list[str]]:
        """
        Returns e.g.:
        {
            SupportedLanguage.PYTHON: ["pytest"],
            SupportedLanguage.TYPESCRIPT: ["jest"],
        }
        """

    def _detect_python_frameworks(self, root: Path, file_index: FileIndex) -> list[str]: ...
    def _detect_typescript_frameworks(self, root: Path, file_index: FileIndex) -> list[str]: ...
Detection logic (single source of truth):
Python:
- Parse pyproject.toml [tool.pytest.ini_options] or [build-system] deps for pytest
- Check requirements*.txt for pytest, unittest (stdlib, always available)
- Confirm: test_*.py or conftest.py present → pytest; TestCase subclasses → unittest
- Conflict: pyproject.toml says pytest but zero test_*.py files → report "unknown"
TypeScript/JavaScript:
- Parse package.json devDependencies for jest, vitest, @jest/core, @vitest/ui
- Confirm: jest.config.* → jest; vite.config.* + *.test.* → vitest
- Both genuinely present → report both
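The Python rules above can be sketched as a pure function. This is an illustration under assumptions, not the detector itself: the function name and its three inputs are hypothetical, the FileIndex is replaced by a plain list of paths, and the conflict rule is checked first so that it overrides any other evidence.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def detect_python_frameworks(
    manifest_mentions_pytest: bool,
    paths: list[str],
    has_testcase_subclasses: bool,
) -> list[str]:
    """Hypothetical helper mirroring the Parse / Confirm / Conflict rules."""
    names = [PurePosixPath(p).name for p in paths]
    has_test_files = any(fnmatchcase(n, "test_*.py") for n in names)
    has_conftest = "conftest.py" in names

    # Conflict: the manifest claims pytest but no test_*.py file exists.
    if manifest_mentions_pytest and not has_test_files:
        return ["unknown"]

    frameworks: list[str] = []
    # Confirm: file patterns decide what is actually reported.
    if has_test_files or has_conftest:
        frameworks.append("pytest")
    if has_testcase_subclasses:
        frameworks.append("unittest")
    return frameworks


print(detect_python_frameworks(True, ["tests/test_api.py"], False))
print(detect_python_frameworks(True, ["src/app.py"], False))
```

Keeping the decision logic free of filesystem access makes each rule testable with a handful of path strings.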
src/specleft/discovery/context.py
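from dataclasses import dataclass
from pathlib import Path

from specleft.discovery.config import DiscoveryConfig
from specleft.discovery.file_index import FileIndex
from specleft.discovery.models import SupportedLanguage
# LanguageRegistry: module path not specified in this issue


@dataclass(frozen=True)
class MinerContext:
    """
    Immutable context passed to every miner's mine() method.
    New fields can be added in future versions without breaking
    the mine() signature.
    """

    root: Path
    registry: LanguageRegistry
    file_index: FileIndex
    frameworks: dict[SupportedLanguage, list[str]]
    config: DiscoveryConfig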
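A small sketch of why a frozen context object keeps the mine() signature stable. The trimmed-down MinerContext, the verbose field, and the free function mine are all hypothetical stand-ins used only for illustration.

```python
from dataclasses import FrozenInstanceError, dataclass
from pathlib import Path


@dataclass(frozen=True)
class MinerContext:
    root: Path
    frameworks: dict[str, list[str]]
    # Hypothetical later addition: a default keeps existing call sites working.
    verbose: bool = False


def mine(ctx: MinerContext) -> list[str]:
    # A miner reads only the fields it needs; new fields do not affect it.
    return ctx.frameworks.get("python", [])


ctx = MinerContext(root=Path("."), frameworks={"python": ["pytest"]})
print(mine(ctx))

try:
    ctx.root = Path("/tmp")  # rebinding a field on a frozen dataclass fails
except FrozenInstanceError:
    print("MinerContext fields cannot be reassigned")
```

Note that frozen=True only blocks attribute reassignment; the frameworks dict itself remains mutable, so miners should treat it as read-only by convention.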
src/specleft/discovery/pipeline.py
import uuid
from pathlib import Path
from typing import Protocol

from specleft.discovery.config import DiscoveryConfig
from specleft.discovery.context import MinerContext
from specleft.discovery.models import SupportedLanguage, MinerResult, DiscoveryReport


class BaseMiner(Protocol):
    miner_id: uuid.UUID  # class-level UUID4 constant
    name: str  # human-readable identifier
    languages: frozenset[SupportedLanguage]  # empty = language-agnostic

    def mine(self, ctx: MinerContext) -> MinerResult: ...


class DiscoveryPipeline:
    def __init__(self, root: Path, languages: list[SupportedLanguage] | None = None) -> None: ...
    def register(self, miner: BaseMiner) -> None: ...
    def run(self) -> DiscoveryReport: ...


def build_default_pipeline(
    root: Path,
    languages: list[SupportedLanguage] | None = None,
    config: DiscoveryConfig | None = None,
) -> DiscoveryPipeline:
    """
    Constructs pipeline with all built-in miners registered.
    Builds FileIndex, detects frameworks, constructs MinerContext.
    If languages=None, auto-detects. If config=None, loads from pyproject.toml.
    """
BaseMiner protocol contract
| Field | Type | Purpose |
|---|---|---|
| miner_id | uuid.UUID | Class-level UUID4 constant. Collision-proof across built-in and future third-party miners. |
| name | str | Human-readable label (e.g. "python_test_functions"). Used in JSON output, error messages, logging. |
| languages | frozenset[SupportedLanguage] | Immutable set of languages this miner handles. Empty = language-agnostic (runs for all projects). |
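To make the contract concrete, here is a sketch of a class that satisfies the protocol structurally. ReadmeMiner, its UUID literal, and its items are invented for illustration, and SupportedLanguage and MinerResult are minimal stand-ins for the real models.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum
from typing import Protocol


class SupportedLanguage(Enum):  # stand-in for the real enum
    PYTHON = "python"
    TYPESCRIPT = "typescript"


@dataclass
class MinerResult:  # stand-in; only `items` is modelled here
    items: list[str] = field(default_factory=list)


class BaseMiner(Protocol):
    miner_id: uuid.UUID
    name: str
    languages: frozenset[SupportedLanguage]

    def mine(self, ctx: object) -> MinerResult: ...


class ReadmeMiner:
    """Hypothetical language-agnostic miner; not one of the built-ins."""

    # Generated once and stored as a literal so the ID is stable across runs.
    miner_id = uuid.UUID("3f2b8c1e-5d4a-4e6f-9a7b-0c1d2e3f4a5b")
    name = "readme_headings"
    languages: frozenset[SupportedLanguage] = frozenset()  # empty = all projects

    def mine(self, ctx: object) -> MinerResult:
        return MinerResult(items=["Installation", "Usage"])


miner: BaseMiner = ReadmeMiner()  # structural typing: no inheritance needed
print(miner.name, len(miner.mine(ctx=None).items))
```

Calling uuid.uuid4() at class-definition time would produce a different ID on every import, which is why the sketch hard-codes the value.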
Runner behaviour
- Pipeline builds FileIndex once, runs FrameworkDetector once, constructs MinerContext
- Miners run sequentially (parallelism deferred to a future optimisation)
- When a miner raises, the pipeline captures the exception into MinerResult.error + MinerResult.error_kind and continues
- Pipeline populates MinerResult.miner_id and MinerResult.miner_name from the miner instance
- DiscoveryReport.duration_ms is wall-clock time for the entire run
- DiscoveryReport.total_items equals the sum of len(r.items) for all non-error results
- Miners whose languages frozenset has no overlap with detected languages are skipped silently
- Language-agnostic miners (languages = frozenset()) always run
Registration validation
register() rejects a miner whose miner_id UUID is already registered (raises ValueError)
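A minimal sketch of the duplicate check, assuming miners are stored in a dict keyed by miner_id. MinerRegistry and the error message wording are hypothetical; in the issue this logic belongs to DiscoveryPipeline.register().

```python
import uuid


class MinerRegistry:
    """Hypothetical holder for registered miners, keyed by miner_id."""

    def __init__(self) -> None:
        self._miners: dict[uuid.UUID, object] = {}

    def register(self, miner) -> None:
        existing = self._miners.get(miner.miner_id)
        if existing is not None:
            raise ValueError(
                f"miner_id {miner.miner_id} is already registered "
                f"by {existing.name!r}; cannot register {miner.name!r}"
            )
        self._miners[miner.miner_id] = miner

    def miners(self) -> list:
        # dicts preserve insertion order, so this is registration order.
        return list(self._miners.values())
```

Naming both the existing and the rejected miner in the message should make a copied UUID easy to trace.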
Acceptance criteria
- DiscoveryConfig.from_pyproject(root) loads custom settings from [tool.specleft.discovery]
- DiscoveryConfig.from_pyproject(root) returns defaults when section is missing
- FrameworkDetector.detect() returns {PYTHON: ["pytest"]} on the specleft repo
- FrameworkDetector is called once by the pipeline, result passed to all miners via MinerContext
- MinerContext is constructed once and shared across all miner calls
- build_default_pipeline(root).run() returns a DiscoveryReport even if every miner errors
- Exceptions raised by a miner are captured in MinerResult.error + error_kind
- total_items is correct when some miners have errors (error items excluded from count)
- Language-agnostic miners (languages=frozenset()) always run
- register() raises ValueError if miner_id UUID already registered
- MinerResult.miner_id matches the miner's UUID; MinerResult.miner_name matches its name
- report.total_items > 0
- Tests: tests/discovery/test_pipeline.py, tests/discovery/test_framework_detector.py, tests/discovery/test_config.py
- Update features/feature-spec-discovery.md to cover the functionality introduced by this issue