Qitor/owl

OWL

OWL is a frontier AI risk signal platform for monitoring public papers and blogs, normalizing them into durable markdown, extracting evidence-backed findings, routing them through review, and publishing reviewed signals.

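OWL is a monitoring platform for frontier AI risk signals. It continuously collects public papers and blogs, normalizes them into traceable markdown, extracts evidence-backed structured findings, sends them to human review, and finally publishes them as official signals.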

English

What This Repository Contains

  • A Python backend for ingestion, scheduling, stage workers, review packaging, publishing, and OpenAI-compatible LLM calls.
  • A FastAPI service for the API surface.
  • A TypeScript web application for the analyst and operations workbench.
  • A Textual TUI for operational monitoring and queue handling.
  • Repository-local design docs that serve as the system of record for architecture, product intent, and execution plans.

Core Product Shape

  • fetch -> normalize -> triage -> extract -> resolve -> review_package -> publish
  • Source monitoring is durable and restart-friendly.
  • Task state lives in the database rather than in ephemeral queue-only infrastructure.
  • Facts, interpretations, and review decisions are intentionally separated.
  • Published signals must remain traceable back to evidence spans and source documents.
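As a rough illustration of the stage order and the "task state lives in the database" principle above, here is a minimal sketch. The stage names come from the pipeline listed above. The `tasks` table, its columns, and the helper functions are hypothetical and are not taken from the OWL codebase. SQLite is used only to keep the example self-contained.

```python
import sqlite3
from typing import Optional

# Stage order as listed above; everything else in this sketch is hypothetical.
STAGES = ["fetch", "normalize", "triage", "extract", "resolve", "review_package", "publish"]


def next_stage(stage: str) -> Optional[str]:
    """Return the stage that follows `stage`, or None after the last stage."""
    i = STAGES.index(stage)
    return STAGES[i + 1] if i + 1 < len(STAGES) else None


def init_db(conn: sqlite3.Connection) -> None:
    # One row per document; the row itself acts as the durable queue entry.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        " doc_id TEXT PRIMARY KEY,"
        " stage TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )
    conn.commit()


def enqueue(conn: sqlite3.Connection, doc_id: str) -> None:
    """Register a document at the first stage."""
    conn.execute("INSERT INTO tasks (doc_id, stage) VALUES (?, ?)", (doc_id, STAGES[0]))
    conn.commit()


def complete_stage(conn: sqlite3.Connection, doc_id: str) -> Optional[str]:
    """Advance a document to the next stage, or mark it done after the last one."""
    row = conn.execute("SELECT stage FROM tasks WHERE doc_id = ?", (doc_id,)).fetchone()
    if row is None:
        raise KeyError(doc_id)
    following = next_stage(row[0])
    if following is None:
        conn.execute("UPDATE tasks SET status = 'done' WHERE doc_id = ?", (doc_id,))
    else:
        conn.execute(
            "UPDATE tasks SET stage = ?, status = 'pending' WHERE doc_id = ?",
            (following, doc_id),
        )
    conn.commit()
    return following
```

Because every transition is committed to the table, a worker that restarts can query for pending rows at its stage instead of depending on messages that may have been lost.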

Repository Layout

  • src/: Python application code.
  • web/: analyst and ops web UI.
  • sources/: declarative source registry.
  • prompts/: prompt assets used in extraction and resolution.
  • docs/: durable product, architecture, and runtime documentation.
  • tests/: backend and pipeline tests.
  • start.sh: local platform launcher for full-stack development.
  • owl.yaml: YAML-only runtime configuration.

First Read

  1. ARCHITECTURE.md
  2. docs/README.md
  3. AGENTS.md
  4. AGENT.md

Quick Start

Prerequisites:

  • Python 3.11+
  • Node.js + npm
  • Docker
  • Chromium runtime for Crawl4AI / Playwright

Install dependencies:

pip install -e .
cd web && npm install && cd ..
python -m playwright install chromium

Validate config and bootstrap:

PYTHONPATH=src python -m owl config validate --config owl.yaml
PYTHONPATH=src python -m owl db bootstrap --config owl.yaml
PYTHONPATH=src python -m owl source sync --config owl.yaml

Start the full local stack:

./start.sh start

Useful runtime commands:

./start.sh status
./start.sh logs worker-triage --follow
./start.sh logs worker-extract --follow
./start.sh rebuild
./start.sh stop

Daily Development Workflow

  1. Read the relevant design docs before changing architecture or lifecycle behavior.
  2. Prefer small, additive changes that preserve stage boundaries and recovery semantics.
  3. Update the matching doc when you change state machines, APIs, UX contracts, or terminology.
  4. Validate the slice you changed with tests, a build, or a real command rather than relying on reasoning alone.
  5. Keep handoff-friendly commits and avoid mixing unrelated refactors with behavioral changes.

Validation

Backend:

python -m compileall src tests
PYTHONPATH=src pytest -q

Frontend:

cd web
npm test
npm run build
cd ..
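If it helps to run the backend and frontend checks in one pass, the commands above could be wrapped in a small script. This is only a sketch: the `CHECKS` list and the `run_checks` helper are hypothetical and not part of the repository. It assumes the script is started from the repository root, and it uses `sys.executable` in place of a bare `python`.

```python
import os
import subprocess
import sys

# (argv, working directory, extra environment) for each check.
# The commands mirror the Validation section; the structure is an assumption.
CHECKS = [
    ([sys.executable, "-m", "compileall", "src", "tests"], ".", {}),
    (["pytest", "-q"], ".", {"PYTHONPATH": "src"}),
    (["npm", "test"], "web", {}),
    (["npm", "run", "build"], "web", {}),
]


def run_checks(checks=CHECKS, runner=subprocess.run) -> int:
    """Run each check in order and stop at the first non-zero exit code."""
    for argv, cwd, extra_env in checks:
        env = {**os.environ, **extra_env}
        result = runner(argv, cwd=cwd, env=env)
        if result.returncode != 0:
            print("check failed:", " ".join(argv), file=sys.stderr)
            return result.returncode
    return 0
```

Calling `sys.exit(run_checks())` from the repository root would run all four checks and exit with the first failing status. The `runner` parameter exists so the sequence can be exercised without invoking the real tools.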

Runtime Surfaces

  • Home: macro command center with trends, graph, and briefing snapshots.
  • Signals: published signal workbench.
  • Review: analyst validation and decision rail.
  • Inbox: document intake and routing view.
  • Tasks: per-source, per-document collection and processing tracker.
  • Ops: pipeline health, source health, and failure observability.

Collaboration And Handoff

  • Treat this repository as agent-first and documentation-first.
  • Keep durable decisions in docs/, not only in chat threads.
  • Use AGENTS.md as the short machine-readable entry point.
  • Use AGENT.md as the longer Codex and developer handoff guide.
  • When handing work to another developer, include:
    • the product goal
    • the exact files touched
    • the docs updated
    • the tests run
    • the known risks or next steps

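Chinese

Repository Contents

  • Python backend: handles ingestion, scheduling, stage workers, review packaging, publishing, and OpenAI-compatible LLM calls.
  • FastAPI API service: provides the data interfaces shared by the frontend and backend.
  • TypeScript web workbench: the main interface for analysts and operations.
  • Textual TUI: for operational monitoring and queue handling.
  • In-repository design docs: the source of truth for product intent, architecture, and execution plans.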

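Core Pipeline

  • fetch -> normalize -> triage -> extract -> resolve -> review_package -> publish
  • Source collection, task state, and stage artifacts are all designed to be recoverable and replayable.
  • Queue state is authoritative in the database rather than relying only on transient messages.
  • Facts, interpretations, and review decisions must be stored in separate layers.
  • Published signals must be traceable back to evidence spans and source documents.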

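Directory Structure

  • src/: Python application code.
  • web/: analyst and ops web UI.
  • sources/: declarative source registry.
  • prompts/: prompt assets for extraction and resolution.
  • docs/: durable product and architecture documentation.
  • tests/: backend and pipeline tests.
  • start.sh: local launcher script for the full platform.
  • owl.yaml: the sole runtime YAML configuration.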

Recommended First Read

  1. ARCHITECTURE.md
  2. docs/README.md
  3. AGENTS.md
  4. AGENT.md

Quick Start

Prerequisites:

  • Python 3.11+
  • Node.js and npm
  • Docker
  • Chromium runtime required by Crawl4AI / Playwright

Install dependencies:

pip install -e .
cd web && npm install && cd ..
python -m playwright install chromium

Validate the config and bootstrap the database:

PYTHONPATH=src python -m owl config validate --config owl.yaml
PYTHONPATH=src python -m owl db bootstrap --config owl.yaml
PYTHONPATH=src python -m owl source sync --config owl.yaml

Start the whole platform with one command:

./start.sh start

Common runtime commands:

./start.sh status
./start.sh logs worker-triage --follow
./start.sh logs worker-extract --follow
./start.sh rebuild
./start.sh stop

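Daily Development Practice

  1. Read the corresponding design docs before changing architecture, state machines, or public interfaces.
  2. Prefer incremental changes that do not break stage boundaries or recovery semantics.
  3. Update the docs alongside any change that affects terminology, interfaces, lifecycle, or UX contracts.
  4. Verify changes with tests, builds, or real commands rather than stopping at reasoning alone.
  5. Keep handoff cost low when committing, and do not mix unrelated refactors with behavioral changes.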

Validation Commands

Backend:

python -m compileall src tests
PYTHONPATH=src pytest -q

Frontend:

cd web
npm test
npm run build
cd ..

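Main Surfaces

  • Home: macro situational overview, trend charts, graph, and briefing.
  • Signals: published signal workbench.
  • Review: human review and decision interface.
  • Inbox: document intake and routing view.
  • Tasks: tracks collection and processing per source and per document.
  • Ops: pipeline health, source health, and failure observability.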

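Handoff Development Guidelines

  • Maintain the repository as an "agent-first + docs-first" project.
  • Write durable decisions into docs/ first, rather than leaving them only in chat logs.
  • Keep AGENTS.md short so that agents such as Codex can read it quickly.
  • Use AGENT.md as the fuller Codex / developer collaboration handbook.
  • When handing off to the next developer, state at least:
    • the goal of this work
    • which files were modified
    • which docs were updated
    • which tests were run
    • which risks or next steps remain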

About

Frontier AI risk signaling system
