| 1 | +PROJECT IDEA — ARGUS TRACE |
| 2 | + |
| 3 | +Objective |
| 4 | + |
| 5 | +Build a local-first evidence trace and discrepancy review system for construction |
| 6 | +delivery artifacts. |
| 7 | + |
| 8 | +The system ingests schedule exports, meeting minutes, basis documents, action |
| 9 | +logs, and selected transmittals, then derives a traceable discrepancy graph |
| 10 | +showing where commitments, dates, and scope statements diverge across sources. |
| 11 | + |
| 12 | +The primary goal is not general chat or document search. |
| 13 | +The goal is controlled contradiction detection, trace surfacing, and reviewable |
| 14 | +evidence bundles. |
| 15 | + |
| 16 | +The system must allow a user to ask questions such as: |
| 17 | + |
| 18 | +- What dates for Module Set A energization are stated across all sources? |
| 19 | +- Which commitments in meeting minutes are unsupported by the current basis? |
| 20 | +- What changed between Revision B and Revision C of the execution narrative? |
| 21 | +- Which action items appear to conflict with the approved schedule logic? |
| 22 | +- Where do downstream summaries appear to overstate source certainty? |
| 23 | + |
| 24 | +The result is a governed local application that produces deterministic trace |
| 25 | +artifacts, discrepancy records, and human-review packets. |
| 26 | + |
| 27 | + |
| 28 | +Scope |
| 29 | + |
| 30 | +In Scope |
| 31 | +- local ingestion of selected project artifacts |
| 32 | +- normalization into canonical text artifacts |
| 33 | +- source registration with immutable provenance records |
| 34 | +- extraction of claims, dates, entities, and commitments |
| 35 | +- derivation of cross-document trace links |
| 36 | +- contradiction / discrepancy detection |
| 37 | +- confidence-scored review queues |
| 38 | +- exportable evidence packets for human validation |
| 39 | +- resumable project state and deterministic re-entry |
| 40 | +- minimal local UI or review surface |
| 41 | + |
| 42 | +Out of Scope |
| 43 | +- cloud sync |
| 44 | +- OCR for scanned image-only PDFs unless an explicit fallback phase is defined |
| 45 | +- autonomous correction of source documents |
| 46 | +- real-time collaboration |
| 47 | +- enterprise authentication |
| 48 | +- replacement of Primavera, Aconex, SharePoint, or document control systems |
| 49 | +- final truth adjudication without human review |
| 50 | + |
| 51 | + |
| 52 | +Desired End State |
| 53 | + |
| 54 | +A user can point the system at a controlled folder of project artifacts and obtain: |
| 55 | + |
| 56 | +1. canonical normalized artifacts |
| 57 | +2. claim/commitment extraction artifacts |
| 58 | +3. trace-link artifacts connecting related statements |
| 59 | +4. discrepancy records with cited evidence |
| 60 | +5. review packets for human adjudication |
| 61 | +6. a persisted state surface allowing deterministic continuation after interruption |
| 62 | + |
| 63 | + |
| 64 | +System Shape |
| 65 | + |
| 66 | +The system is composed of six logical layers: |
| 67 | + |
| 68 | +1. Intake Layer |
| 69 | + Accepts raw files and registers them. |
| 70 | + |
| 71 | +2. Canonicalization Layer |
| 72 | + Converts accepted files into normalized canonical text artifacts plus metadata. |
| 73 | + |
| 74 | +3. Extraction Layer |
| 75 | + Extracts structured statements such as commitments, dates, quantities, |
| 76 | + milestones, and responsibility assignments. |
| 77 | + |
| 78 | +4. Trace Layer |
| 79 | + Links extracted statements across artifacts using deterministic matching rules |
| 80 | + and bounded semantic assistance. |
| 81 | + |
| 82 | +5. Discrepancy Layer |
| 83 | + Detects divergence, contradiction, omission, and unsupported restatement. |
| 84 | + |
| 85 | +6. Review Layer |
| 86 | + Presents discrepancy records and evidence packets for human disposition. |
| 87 | + |
| 88 | +The exact boundary between Trace Layer and Discrepancy Layer is intentionally |
| 89 | +not fully fixed and may require clarification during roadmap generation. |
| 90 | + |
| 91 | + |
| 92 | +Primary Inputs |
| 93 | + |
| 94 | +Expected raw sources may include: |
| 95 | +- .pdf basis documents |
| 96 | +- .docx meeting minutes |
| 97 | +- .xlsx action registers |
| 98 | +- .xer-derived schedule exports converted to .xlsx or .csv |
| 99 | +- .txt or .md execution narratives |
| 100 | +- .msg exports if already converted to text |
| 101 | +- manually entered user notes placed in a controlled folder |
| 102 | + |
| 103 | +The exact supported minimum file set for a first runnable version is not fully |
| 104 | +defined and should be resolved during planning. |
| 105 | + |
| 106 | + |
| 107 | +Primary Outputs |
| 108 | + |
| 109 | +Expected outputs include: |
| 110 | +- canonical text artifacts |
| 111 | +- provenance sidecars |
| 112 | +- extraction records |
| 113 | +- trace-link records |
| 114 | +- discrepancy records |
| 115 | +- review queue indexes |
| 116 | +- review packet bundles |
| 117 | +- state and run manifests |
| 118 | +- failure reports for rejected files |
| 119 | + |
| 120 | + |
| 121 | +Hard Invariants |
| 122 | + |
| 123 | +1. Filesystem Authority |
| 124 | + Repository artifacts are authoritative. |
| 125 | + Conversational memory is non-authoritative. |
| 126 | + |
| 127 | +2. Write-Once Evidence |
| 128 | + Canonical evidence artifacts are immutable once emitted. |
| 129 | + Corrections create successor artifacts rather than in-place mutation. |
| 130 | + |
| 131 | +3. Stable Provenance |
| 132 | + Every derived artifact must point to its direct parent artifact(s). |
| 133 | + |
| 134 | +4. Deterministic Re-Entry |
| 135 | + A new session must be able to resume from repository state alone. |
| 136 | + |
| 137 | +5. Review Before Promotion |
| 138 | + No discrepancy may be marked resolved without an explicit human disposition artifact. |
| 139 | + |
| 140 | +6. No Silent Merge |
| 141 | + If two extracted claims are collapsed into one trace identity, that merge must |
| 142 | + be represented explicitly in an artifact. |
| 143 | + |
| 144 | +7. Bounded Inference |
| 145 | + Language-model assistance may propose links or discrepancies, but proposals |
| 146 | + must be written as reviewable artifacts and never treated as accepted truth |
| 147 | + by default. |
| 148 | + |
| 149 | +8. Address Contract |
| 150 | + Canonical artifacts must be emitted beneath: |
| 151 | + |
| 152 | + 02_EXODUS/runtime_store/projects/<project_slug>/artifacts/ |
| 153 | + |
| 154 | + using a two-hex fan-out directory structure. |
| 155 | + |
| 156 | +9. Manifest Contract |
| 157 | + Every execution run must emit: |
| 158 | + |
| 159 | + 05_NUMBERS/runs/<run_id>/run_manifest.json |
| 160 | + |
| 161 | +10. Review Queue Contract |
| 162 | + Open discrepancy queue state must be reconstructible from filesystem artifacts |
| 163 | + even if any cache database is deleted. |
| 164 | + |
| 165 | +11. Packet Export Contract |
| 166 | + Every exported review packet must contain exactly one packet manifest, at least |
| 167 | + two cited evidence excerpts, and one machine-readable discrepancy record. |
| 168 | + |
| 169 | +12. No Cross-Project Bleed |
| 170 | + Artifacts from one project slug must never be linked into another project slug |
| 171 | + unless an explicit federation contract exists. |
| 172 | + |
| 173 | +13. Local Path Reservation |
| 174 | + The following path is reserved and must not be repurposed: |
| 175 | + |
| 176 | + 04_DEUTERONOMY/canonical_schemas/trace_identity.schema.json |
| 177 | + |
| 178 | +14. Human Label Constraint |
| 179 | + A reviewer may assign custom labels, but system behavior must not depend on |
| 180 | + free-text labels alone. |
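The Address Contract and Manifest Contract above can be sketched as two path helpers. This is a minimal illustration, not a fixed design: the brief does not say what the two-hex fan-out is derived from, so the sketch assumes it is the first two hex characters of a SHA-256 digest of the artifact bytes. The function names and the `.json` suffix are also hypothetical.

```python
import hashlib
from pathlib import PurePosixPath

# Roots taken from the Address Contract and Manifest Contract.
ARTIFACT_ROOT = PurePosixPath("02_EXODUS/runtime_store/projects")
RUN_ROOT = PurePosixPath("05_NUMBERS/runs")


def artifact_path(project_slug: str, content: bytes, suffix: str = ".json") -> PurePosixPath:
    """Return a content-addressed path beneath the project's artifacts folder.

    Assumption: the fan-out directory is the first two hex characters of the
    SHA-256 digest of the artifact bytes. The brief does not fix this choice.
    """
    digest = hashlib.sha256(content).hexdigest()
    return ARTIFACT_ROOT / project_slug / "artifacts" / digest[:2] / f"{digest}{suffix}"


def run_manifest_path(run_id: str) -> PurePosixPath:
    """Return the manifest path that every execution run must emit."""
    return RUN_ROOT / run_id / "run_manifest.json"
```

Content addressing is one way to support the Write-Once Evidence invariant, since a corrected artifact has different bytes and therefore lands at a different path instead of overwriting its predecessor.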
| 181 | + |
| 182 | + |
| 183 | +Soft / Ambiguous Constraints |
| 184 | + |
| 185 | +These are intentionally under-specified and should force roadmap clarification: |
| 186 | + |
| 187 | +- “Important” contradictions should surface first, but importance is not yet formally defined. |
| 188 | +- Similar dates may or may not represent the same milestone depending on source context. |
| 189 | +- Meeting minutes may be treated as lower authority than approved basis documents, |
| 190 | + but that authority ladder is not fully frozen. |
| 191 | +- Some schedule-derived dates may be considered operational rather than contractual, |
| 192 | + though the distinction is not yet formalized. |
| 193 | +- The first UI may be a browser surface, terminal workflow, or static review export. |
| 194 | +- It is preferred that extraction be deterministic where possible, but some bounded |
| 195 | + semantic interpretation is acceptable if explicitly recorded. |
| 196 | +- A discrepancy may include omission, contradiction, unsupported summary, or drift, |
| 197 | + but the exact taxonomy may need refinement. |
| 198 | +- The system should be “fast enough for practical use” on a local workstation, but |
| 199 | + no explicit performance threshold is yet fixed. |
| 200 | +- The initial supported project size may be a few hundred files or several thousand; |
| 201 | + this is not locked. |
| 202 | +- It is unclear whether transmittals should be treated as evidence, metadata only, |
| 203 | + or intake-control artifacts. |
| 204 | + |
| 205 | + |
| 206 | +Seeded Governance Tension |
| 207 | + |
| 208 | +The project intentionally contains a few tensions that planning must resolve: |
| 209 | + |
| 210 | +- deterministic extraction vs model-assisted trace proposals |
| 211 | +- immutable canonical artifacts vs iterative review outcomes |
| 212 | +- local filesystem authority vs optional cache/index acceleration |
| 213 | +- strong provenance vs practical usability |
| 214 | +- review packet export vs minimal first runnable scope |
| 215 | + |
| 216 | + |
| 217 | +Proposed Artifact Families |
| 218 | + |
| 219 | +Potential artifact families include: |
| 220 | +- source_registration |
| 221 | +- canonical_text |
| 222 | +- canonical_meta |
| 223 | +- extraction_claim |
| 224 | +- extraction_commitment |
| 225 | +- extraction_date |
| 226 | +- trace_identity |
| 227 | +- trace_edge |
| 228 | +- discrepancy_record |
| 229 | +- discrepancy_packet |
| 230 | +- human_disposition |
| 231 | +- run_manifest |
| 232 | +- rejection_record |
| 233 | + |
| 234 | +These names are suggestive, not final. |
| 235 | +Planning may revise them if done explicitly. |
| 236 | + |
| 237 | + |
| 238 | +Invented Example Constraints |
| 239 | + |
| 240 | +- Project slug example: west_delta_demo |
| 241 | +- Reserved reviewer ID format: RVW-### |
| 242 | +- Discrepancy ID prefix: DISC- |
| 243 | +- Trace identity ID prefix: TID- |
| 244 | +- Packet ID prefix: PKT- |
| 245 | +- Run IDs should be time-sortable |
| 246 | +- At least one artifact should preserve exact source excerpt byte offsets if available |
| 247 | +- If byte offsets are unavailable, the fallback locator format is not yet fixed |
| 248 | +- A “red folder” intake class may exist for disputed documents, but behavior is not defined |
| 249 | +- One future integration path may target a local graph store at: |
| 250 | + |
| 251 | + 02_EXODUS/runtime_store/graph_cache/ |
| 252 | + |
| 253 | + but this integration is not required for the first executable version |
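The identifier constraints above could be expressed roughly as follows. Several details are assumptions rather than part of the brief: that `###` in the reviewer ID means exactly three digits, that prefixed IDs carry a zero-padded sequence number, and that run IDs are fixed-width UTC timestamps. All function names are hypothetical.

```python
import re
from datetime import datetime, timezone

# Assumption: "RVW-###" means the prefix followed by exactly three digits.
REVIEWER_ID = re.compile(r"RVW-\d{3}")

# Prefixes listed in the brief.
PREFIXES = {"discrepancy": "DISC-", "trace_identity": "TID-", "packet": "PKT-"}


def is_reviewer_id(value: str) -> bool:
    """Check a reviewer ID against the assumed three-digit format."""
    return REVIEWER_ID.fullmatch(value) is not None


def make_id(kind: str, sequence: int) -> str:
    """Build a prefixed ID; the zero-padded sequence is a hypothetical scheme."""
    return f"{PREFIXES[kind]}{sequence:06d}"


def make_run_id(now: datetime) -> str:
    """Format a timezone-aware datetime as a fixed-width UTC stamp.

    Fixed width means plain string sorting matches chronological order.
    """
    if now.tzinfo is None:
        raise ValueError("run IDs require a timezone-aware datetime")
    return now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
```

Because every field in the run ID is zero-padded, sorting the directory names under the runs folder lists runs in the order they started, which is one simple reading of "time-sortable".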
| 254 | + |
| 255 | + |
| 256 | +Failure Philosophy |
| 257 | + |
| 258 | +- Fail closed on provenance uncertainty. |
| 259 | +- Fail open on optional enrichment. |
| 260 | +- Reject unsupported file types explicitly. |
| 261 | +- Do not discard contradictory evidence merely because a higher-authority source exists. |
| 262 | +- Preserve rejected-input records for audit. |
| 263 | +- Prefer explicit review artifacts over hidden runtime judgment. |
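A minimal sketch of explicit rejection at intake, under stated assumptions. The extension set is drawn from the Primary Inputs list, but the minimum supported set is still unresolved, so treat it as a placeholder. The record fields and function names are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Iterable, List, Tuple

# Assumption: placeholder set based on Primary Inputs. A raw .msg file is
# excluded because the brief accepts .msg exports only once converted to text.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".xlsx", ".csv", ".txt", ".md"}


@dataclass(frozen=True)
class RejectionRecord:
    """Hypothetical audit record for a file refused at intake."""
    source_name: str
    reason: str


def triage(paths: Iterable[str]) -> Tuple[List[str], List[RejectionRecord]]:
    """Split candidate files into accepted paths and explicit rejections."""
    accepted: List[str] = []
    rejected: List[RejectionRecord] = []
    for path in paths:
        ext = Path(path).suffix.lower()
        if ext in SUPPORTED_EXTENSIONS:
            accepted.append(path)
        else:
            reason = f"unsupported file type: {ext or '(none)'}"
            rejected.append(RejectionRecord(Path(path).name, reason))
    return accepted, rejected


def rejections_to_json(records: List[RejectionRecord]) -> str:
    """Serialize rejection records so they can be preserved for audit."""
    return json.dumps([asdict(r) for r in records], indent=2, sort_keys=True)
```

Nothing is dropped silently here: every input ends up either in the accepted list or in a rejection record with a stated reason.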
| 264 | + |
| 265 | + |
| 266 | +Example User Outcomes |
| 267 | + |
| 268 | +A planner or controls lead should be able to: |
| 269 | +- review all conflicting milestone dates for a named deliverable |
| 270 | +- inspect the exact excerpts that created a discrepancy |
| 271 | +- export a packet for team review |
| 272 | +- resume prior work after interruption without conversational context |
| 273 | +- distinguish machine-proposed links from human-accepted conclusions |
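Before a packet is exported for team review, the Packet Export Contract could be checked with something like the sketch below. The in-memory packet layout (a dict with three list-valued keys) is purely hypothetical, and reading "one machine-readable discrepancy record" as exactly one is an assumption.

```python
from typing import Dict, List


def packet_violations(packet: Dict[str, list]) -> List[str]:
    """Return the contract violations for a packet; an empty list means it passes.

    Hypothetical layout: 'manifests', 'evidence_excerpts', and
    'discrepancy_records' each map to a list of items.
    """
    problems: List[str] = []
    if len(packet.get("manifests", [])) != 1:
        problems.append("packet must contain exactly one packet manifest")
    if len(packet.get("evidence_excerpts", [])) < 2:
        problems.append("packet must cite at least two evidence excerpts")
    # Assumption: "one" discrepancy record is read as exactly one.
    if len(packet.get("discrepancy_records", [])) != 1:
        problems.append("packet must contain one discrepancy record")
    return problems
```

Returning a list of violations, instead of raising on the first one, lets a rejected export report everything that is wrong in a single pass.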
| 274 | + |
| 275 | + |
| 276 | +Open Questions Intentionally Left Unresolved |
| 277 | + |
| 278 | +- What exact authority order should govern source classes? |
| 279 | +- What minimum artifact set defines “Phase 1 runnable”? |
| 280 | +- Should schedule logic be parsed structurally or only through exported text tables? |
| 281 | +- Is the first review surface an app, CLI flow, or packet-only workflow? |
| 282 | +- What discrepancy categories are mandatory in v1? |
| 283 | +- What exact success criteria define acceptable trace precision? |
| 284 | +- When should semantic assistance be allowed to create candidate links? |
| 285 | +- How are superseded source revisions detected and represented? |
| 286 | + |
| 287 | +Success Condition |
| 288 | + |
| 289 | +The project succeeds when a bounded local system can ingest a controlled sample |
| 290 | +set, produce canonical and trace artifacts, surface discrepancy records with |
| 291 | +evidence, and support deterministic review continuation using repository state |
| 292 | +alone. |
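Review continuation from repository state alone could look roughly like the sketch below, which rebuilds the open discrepancy queue by scanning artifact files and uses no cache database. It rests on several assumptions not fixed by the brief: artifacts are JSON files, each carries a `family` field and a `discrepancy_id` field, and any human disposition artifact closes its discrepancy.

```python
import json
from pathlib import Path
from typing import List


def open_discrepancies(artifacts_dir: Path) -> List[str]:
    """Rebuild the open review queue from artifact files only.

    Assumptions (hypothetical schema): every artifact is a JSON object with a
    'family' field; discrepancy_record and human_disposition artifacts both
    carry a 'discrepancy_id'.
    """
    recorded = set()
    disposed = set()
    for path in sorted(artifacts_dir.rglob("*.json")):
        artifact = json.loads(path.read_text(encoding="utf-8"))
        family = artifact.get("family")
        if family == "discrepancy_record":
            recorded.add(artifact["discrepancy_id"])
        elif family == "human_disposition":
            disposed.add(artifact["discrepancy_id"])
    # A discrepancy stays open until an explicit human disposition exists.
    return sorted(recorded - disposed)
```

Because the result depends only on files under the artifacts directory, deleting an index or cache changes nothing about what the queue contains, only how quickly it can be rebuilt.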