WIP: Combine many remote to few remote module groups#6061
WIP: Combine many remote to few remote module groups#6061hjmjohnson wants to merge 2 commits intoInsightSoftwareConsortium:mainfrom
Conversation
3491800 to
d7f44e1
Compare
|
|
||
| itk_fetch_module_group(Analysis | ||
| "Analysis domain: TextureFeatures, BoneMorphometry, BoneEnhancement, Thickness3D, IsotropicWavelets, RANSAC, PerformanceBenchmarking, StructuralSimilarity" | ||
| GIT_REPOSITORY https://github.com/hjmjohnson/ITKRemoteAnalysis.git |
There was a problem hiding this comment.
I think it would be helpful to bring these modules into the ITK repository. But not dump them in a different repository, which complicates ownership and maintenance. Or, create a Remote Module Group, that groups the individual repositories.
There was a problem hiding this comment.
Adding them to the main repository would make sense too, especially for the more well established ones. If doing so, we would need to expand on the concept of "Review" module (which was the path to entering the main repository in the past), including building everything into python packages.
There was a problem hiding this comment.
For the remote modules for which we have strong evidence that only the core developers maintain them, I would 100% support bringing them into the ITK repository. That is an enormous maintenance burden relief! I understand the rationale behind the remote modules, but the implementation realities are that it is an extraordinary developer burden.
There was a problem hiding this comment.
I am not sure why effort is being put into consolidating the modules into a single module without bring the modules into ITK. If the rationale that it is an extraordinary developer burdern for ITK developers to maintain all the ITK Remote Modules as Remote modules (and the assumption is that ITK core developers should maintain all Remote Modules, which I did not intend to be the case), then bring them into the ITK repository. Part of the rationale behind remote modules is that non-ITK core developers can create and maintain them outside the ITK repository. This is lost with such an ITKRemoteAnalysis repository.
Add hierarchical sub-group support for remote modules, enabling consolidation of many single-module repos into category repos. CMake changes (~160 lines across 3 files): - ITKModuleEnablement.cmake: add 4-level glob for sub-grouped modules - ITKModuleRemote.cmake: new itk_fetch_module_group() function - ITKGroups.cmake: sub-group request propagation for ITKGroup_Remote_* Proof of concept: ITKRemoteAnalysis replaces 7 individual .remote.cmake files with a single category repo containing 8 modules (TextureFeatures, BoneMorphometry, BoneEnhancement, Thickness3D, IsotropicWavelets, RANSAC, PerformanceBenchmarking, StructuralSimilarity). All 77 tests pass. Usage: cmake -DITKGroup_Remote_Analysis=ON ... # Modules with EXCLUDE_FROM_DEFAULT need explicit Module_<name>=ON Ref: InsightSoftwareConsortium#6060
- Point GIT_REPOSITORY at hjmjohnson/ITKRemoteAnalysis on GitHub (was local path, inaccessible to CI) - Enable ITKGroup_Remote_Analysis and all EXCLUDE_FROM_DEFAULT modules in Pixi configure-ci so CI builds and tests all 8 analysis modules This is a WIP commit for CI validation — the pyproject.toml changes will be reverted before merge. The final .remote.cmake will point at InsightSoftwareConsortium/ITKRemoteAnalysis. Ref: InsightSoftwareConsortium#6060
d7f44e1 to
c0a9400
Compare
|
rejected, completely different approach will need to be looked into. |
|
Superseded by #6085, which implements the inline-ingest strategy that reviewers requested in the discussion above (bring modules into the ITK repository rather than consolidating them into a separate group repo). Keeping this PR closed and leaving all prior comments visible — they are the direct motivation for the new approach. |
Four long-form documents landing alongside ingest-remote-module.sh:
INGESTION_STRATEGY.md
Policy document. Whitelist definition, mode selection (full /
filtered / squash), attribution floor, CID-normalization pipeline,
examples-relocation policy. Codifies the PR InsightSoftwareConsortium#6093 consensus that
commit count is NOT a gate -- only size metrics are -- so modules
with hundreds of genuine upstream commits can land in full-history
mode as long as pack-delta and blob-size thresholds are met.
AUDIT_DESIGN.md
Design notes for the pre-ingest audit pass: blob-size histogram,
strip-candidate path detection (paths present in pre-tip history
but absent in tip), copyright-review flag for PDFs / videos /
large images, recommend_mode() pseudocode.
CLEANUP_CHECKLIST.md
What the post-merge STYLE commit checks (now a safety-net since
the whitelist handles the common case at graft time). Still used
for copyright review and for Mode B residual-blob stripping.
AGENTS.md
Guidance for AI coding agents running this workflow. Pre-flight
gates, decision points (non-Apache license, raw binary test assets,
non-whitelisted paths the module needs, CID-normalization gap,
examples/ routing), escalation triggers for handing back to the
human. Explicit "don't do these things" section covers common
pitfalls (re-squash-silently, widen-whitelist-without-documenting,
force-push-ingest-PR).
These documents were developed and iterated on across PR InsightSoftwareConsortium#6061,
InsightSoftwareConsortium#6085, InsightSoftwareConsortium#6086, and especially InsightSoftwareConsortium#6093 (the thread that produced the v3
whitelist + CID-normalization approach). Landing them in-tree under
Utilities/Maintenance/RemoteModuleIngest/ makes future changes to the
strategy reviewable via standard PR process rather than through PR
comment updates on long-running threads.
Four long-form documents landing alongside ingest-remote-module.sh:
INGESTION_STRATEGY.md
Policy document. Whitelist definition, mode selection (full /
filtered / squash), attribution floor, CID-normalization pipeline,
examples-relocation policy. Codifies the PR InsightSoftwareConsortium#6093 consensus that
commit count is NOT a gate -- only size metrics are -- so modules
with hundreds of genuine upstream commits can land in full-history
mode as long as pack-delta and blob-size thresholds are met.
AUDIT_DESIGN.md
Design notes for the pre-ingest audit pass: blob-size histogram,
strip-candidate path detection (paths present in pre-tip history
but absent in tip), copyright-review flag for PDFs / videos /
large images, recommend_mode() pseudocode.
CLEANUP_CHECKLIST.md
What the post-merge STYLE commit checks (now a safety-net since
the whitelist handles the common case at graft time). Still used
for copyright review and for Mode B residual-blob stripping.
AGENTS.md
Guidance for AI coding agents running this workflow. Pre-flight
gates, decision points (non-Apache license, raw binary test assets,
non-whitelisted paths the module needs, CID-normalization gap,
examples/ routing), escalation triggers for handing back to the
human. Explicit "don't do these things" section covers common
pitfalls (re-squash-silently, widen-whitelist-without-documenting,
force-push-ingest-PR).
These documents were developed and iterated on across PR InsightSoftwareConsortium#6061,
InsightSoftwareConsortium#6085, InsightSoftwareConsortium#6086, and especially InsightSoftwareConsortium#6093 (the thread that produced the v3
whitelist + CID-normalization approach). Landing them in-tree under
Utilities/Maintenance/RemoteModuleIngest/ makes future changes to the
strategy reviewable via standard PR process rather than through PR
comment updates on long-running threads.
Four long-form documents landing alongside ingest-remote-module.sh:
INGESTION_STRATEGY.md
Policy document. Whitelist definition, mode selection (full /
filtered / squash), attribution floor, CID-normalization pipeline,
examples-relocation policy. Codifies the PR InsightSoftwareConsortium#6093 consensus that
commit count is NOT a gate -- only size metrics are -- so modules
with hundreds of genuine upstream commits can land in full-history
mode as long as pack-delta and blob-size thresholds are met.
AUDIT_DESIGN.md
Design notes for the pre-ingest audit pass: blob-size histogram,
strip-candidate path detection (paths present in pre-tip history
but absent in tip), copyright-review flag for PDFs / videos /
large images, recommend_mode() pseudocode.
CLEANUP_CHECKLIST.md
What the post-merge STYLE commit checks (now a safety-net since
the whitelist handles the common case at graft time). Still used
for copyright review and for Mode B residual-blob stripping.
AGENTS.md
Guidance for AI coding agents running this workflow. Pre-flight
gates, decision points (non-Apache license, raw binary test assets,
non-whitelisted paths the module needs, CID-normalization gap,
examples/ routing), escalation triggers for handing back to the
human. Explicit "don't do these things" section covers common
pitfalls (re-squash-silently, widen-whitelist-without-documenting,
force-push-ingest-PR).
These documents were developed and iterated on across PR InsightSoftwareConsortium#6061,
InsightSoftwareConsortium#6085, InsightSoftwareConsortium#6086, and especially InsightSoftwareConsortium#6093 (the thread that produced the v3
whitelist + CID-normalization approach). Landing them in-tree under
Utilities/Maintenance/RemoteModuleIngest/ makes future changes to the
strategy reviewable via standard PR process rather than through PR
comment updates on long-running threads.
Four long-form documents landing alongside ingest-remote-module.sh:
INGESTION_STRATEGY.md
Policy document. Whitelist definition, mode selection (full /
filtered / squash), attribution floor, CID-normalization pipeline,
examples-relocation policy. Codifies the PR InsightSoftwareConsortium#6093 consensus that
commit count is NOT a gate -- only size metrics are -- so modules
with hundreds of genuine upstream commits can land in full-history
mode as long as pack-delta and blob-size thresholds are met.
AUDIT_DESIGN.md
Design notes for the pre-ingest audit pass: blob-size histogram,
strip-candidate path detection (paths present in pre-tip history
but absent in tip), copyright-review flag for PDFs / videos /
large images, recommend_mode() pseudocode.
CLEANUP_CHECKLIST.md
What the post-merge STYLE commit checks (now a safety-net since
the whitelist handles the common case at graft time). Still used
for copyright review and for Mode B residual-blob stripping.
AGENTS.md
Guidance for AI coding agents running this workflow. Pre-flight
gates, decision points (non-Apache license, raw binary test assets,
non-whitelisted paths the module needs, CID-normalization gap,
examples/ routing), escalation triggers for handing back to the
human. Explicit "don't do these things" section covers common
pitfalls (re-squash-silently, widen-whitelist-without-documenting,
force-push-ingest-PR).
These documents were developed and iterated on across PR InsightSoftwareConsortium#6061,
InsightSoftwareConsortium#6085, InsightSoftwareConsortium#6086, and especially InsightSoftwareConsortium#6093 (the thread that produced the v3
whitelist + CID-normalization approach). Landing them in-tree under
Utilities/Maintenance/RemoteModuleIngest/ makes future changes to the
strategy reviewable via standard PR process rather than through PR
comment updates on long-running threads.
Four long-form documents landing alongside ingest-remote-module.sh:
INGESTION_STRATEGY.md
Policy document. Whitelist definition, mode selection (full /
filtered / squash), attribution floor, CID-normalization pipeline,
examples-relocation policy. Codifies the PR InsightSoftwareConsortium#6093 consensus that
commit count is NOT a gate -- only size metrics are -- so modules
with hundreds of genuine upstream commits can land in full-history
mode as long as pack-delta and blob-size thresholds are met.
AUDIT_DESIGN.md
Design notes for the pre-ingest audit pass: blob-size histogram,
strip-candidate path detection (paths present in pre-tip history
but absent in tip), copyright-review flag for PDFs / videos /
large images, recommend_mode() pseudocode.
CLEANUP_CHECKLIST.md
What the post-merge STYLE commit checks (now a safety-net since
the whitelist handles the common case at graft time). Still used
for copyright review and for Mode B residual-blob stripping.
AGENTS.md
Guidance for AI coding agents running this workflow. Pre-flight
gates, decision points (non-Apache license, raw binary test assets,
non-whitelisted paths the module needs, CID-normalization gap,
examples/ routing), escalation triggers for handing back to the
human. Explicit "don't do these things" section covers common
pitfalls (re-squash-silently, widen-whitelist-without-documenting,
force-push-ingest-PR).
These documents were developed and iterated on across PR InsightSoftwareConsortium#6061,
InsightSoftwareConsortium#6085, InsightSoftwareConsortium#6086, and especially InsightSoftwareConsortium#6093 (the thread that produced the v3
whitelist + CID-normalization approach). Landing them in-tree under
Utilities/Maintenance/RemoteModuleIngest/ makes future changes to the
strategy reviewable via standard PR process rather than through PR
comment updates on long-running threads.
WIP proof of concept for #6060: hierarchical remote module sub-groups.
Adds
itk_fetch_module_group()and demonstrates consolidating 7 individual remote repos + SSIM into a singleITKRemoteAnalysiscategory repo with 8 modules and 1660 preserved commits. All 77 module tests pass locally.CMake changes (~160 lines, 3 files)
CMake/ITKModuleEnablement.cmake— add 4-level glob soModules/Remote/Analysis/TextureFeatures/itk-module.cmakeis discoveredCMake/ITKModuleRemote.cmake— newitk_fetch_module_group()function: clones a category repo intoModules/Remote/<Category>/, respects freeze and tag overrideCMake/ITKGroups.cmake— sub-group request propagation: whenITKGroup_Remote_Analysis=ON, all modules in the Analysis directory are requested for buildUsage
Test results (77/77 pass)
What's NOT in this PR (future work)
ITKRemoteAnalysisrepo is local only — needs ISC org repo creation and push.remote.cmakecurrently points at a local path; will be updated to GitHub URL