aturret · aturret · Feb 18, 2026 · Feb 18, 2026 · Feb 18, 2026 · Feb 18, 2026
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -5,65 +5,74 @@ on:
     branches:
       - main
 
-env:
-  APP_NAME: fastfetchbot
-  DOCKERHUB_REPO: aturret/fastfetchbot
-#   APP_VERSION: latest
-
-concurrency: 
+concurrency:
   group: fastfetchbot
   cancel-in-progress: true
 
 jobs:
-  docker:
+  build:
     runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      packages: write
+    strategy:
+      matrix:
+        include:
+          - service: api
+            dockerfile: apps/api/Dockerfile
+            image_suffix: api
+          - service: telegram-bot
+            dockerfile: apps/telegram-bot/Dockerfile
+            image_suffix: telegram-bot
     steps:
-      -
-        name: Checkout
-        uses: actions/checkout@v2
-      -
-        name: Check commit message
+      - name: Checkout
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - name: Check commit message
         id: check_message
         run: |
           MESSAGE=$(git log --format=%B -n 1 ${{ github.sha }})
           if [[ "$MESSAGE" == *"[github-action]"* ]]; then
-            echo "::set-output name=skip::true"
+            echo "skip=true" >> "$GITHUB_OUTPUT"
           else
-            echo "::set-output name=skip::false"
+            echo "skip=false" >> "$GITHUB_OUTPUT"
           fi
 
-      -
-        name: Set up QEMU
-        uses: docker/setup-qemu-action@v1
-      -
-        name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v2
-      -
-        name: Login to DockerHub
-        uses: docker/login-action@v1
+      - name: Set up QEMU
+        uses: docker/setup-qemu-action@v3
+
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+
+      - name: Login to GitHub Container Registry
+        uses: docker/login-action@v3
         with:
-          username: ${{ secrets.DOCKERHUB_USERNAME }}
-          password: ${{ secrets.DOCKERHUB_TOKEN }}
-      -
-        name: Generate App Version
-        run: echo APP_VERSION=`git describe --tags --always` >> $GITHUB_ENV
-      -
-        name: Build and push
+          registry: ghcr.io
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Generate App Version
+        run: echo "APP_VERSION=$(git describe --tags --always)" >> "$GITHUB_ENV"
+
+      - name: Build and push
         if: steps.check_message.outputs.skip == 'false'
-        uses: docker/build-push-action@v2
+        uses: docker/build-push-action@v6
         with:
           context: .
-          platforms: |
-            linux/amd64
+          file: ${{ matrix.dockerfile }}
+          platforms: linux/amd64
           push: true
           build-args: |
-            APP_NAME=${{ env.APP_NAME }}
             APP_VERSION=${{ env.APP_VERSION }}
           tags: |
-            ${{ env.DOCKERHUB_REPO }}:latest
-#             ${{ env.DOCKERHUB_REPO }}:${{ env.APP_VERSION }}
-      -
-        name: send curl request
-        run: |
-          curl -H 'Authorization: Bearer ${{ secrets.WATCHTOWER_TOKEN }}' ${{ secrets.WATCHTOWER_WEBHOOK_URL }}
+            ghcr.io/${{ github.repository }}-${{ matrix.image_suffix }}:latest
 
+  deploy:
+    needs: build
+    runs-on: ubuntu-latest
+    steps:
+      - name: Trigger Watchtower deployment
+        run: |
+          curl -H "Authorization: Bearer ${{ secrets.WATCHTOWER_TOKEN }}" ${{ secrets.WATCHTOWER_WEBHOOK_URL }}
diff --git a/README.md b/README.md
@@ -2,128 +2,191 @@ Demo: https://t.me/aturretrss_bot
 
 # FastFetchBot
 
-A social media fetch API based on [FastAPI](https://fastapi.tiangolo.com/), with Telegram Bot as the default client.
+A social media content fetching service with a Telegram Bot client, built as a monorepo with two microservices.
 
-Supported most mainstream social media platforms. You can get a permanent copy of the content by just sending the url to the bot.
+Send a social media URL to the bot, and it fetches and archives the content for you. Supports most mainstream social media platforms.
 
-Other separated microservices for this project:
+## Architecture
 
-- [FastFileExporter](https://github.com/aturret/FastFileExporter)
-- [FastFetchBot-Telegram-Bot](https://github.com/aturret/FastFetchBot-Telegram-Bot)
+FastFetchBot is organized as a UV workspace monorepo with three packages:
 
+```
+FastFetchBot/
+├── packages/shared/          # fastfetchbot-shared: common models, utilities, logger
+├── apps/api/                 # FastAPI server: scrapers, storage, routing
+├── apps/telegram-bot/        # Telegram Bot: webhook/polling, message handling
+├── app/                      # Legacy re-export wrappers (backward compatibility)
+├── pyproject.toml            # Root workspace configuration
+└── uv.lock                   # Lockfile for the entire workspace
+```
 
-## Installation
-
-### Docker (Recommended)
-
-Download the docker-compose.yml file and set the environment variables as the following section.
+| Service | Port | Description |
+|---------|------|-------------|
+| **API Server** (`apps/api/`) | 10450 | FastAPI app with all platform scrapers, file export, and storage |
+| **Telegram Bot** (`apps/telegram-bot/`) | 10451 | Receives messages via webhook or long polling, calls the API server |
 
-#### Env
+The Telegram Bot communicates with the API server over HTTP. In Docker, this is `http://api:10450`.
 
-Create a `.env` file at the same directory and set the [environment variables](#envrionment-variables).
+## Installation
 
-#### Local Telegram API Sever
+### Docker (Recommended)
 
-If you want to send documents that larger than 50MB, you need to run a local telegram api server. The `docker-compose.yml` file has already give you an example. You just need to fill the `TELEGRAM_API_ID` and `TELEGRAM_API_HASH` in the yml file. If you don't need it, just comment it out.
+1. Copy `docker-compose.template.yml` to `docker-compose.yml`.
+2. Create a `.env` file from `template.env` and fill in the [environment variables](#environment-variables).
+3. If you need large file support (>50 MB), fill in `TELEGRAM_API_ID` and `TELEGRAM_API_HASH` in the compose file for the local Telegram Bot API server. Otherwise, comment out the `telegram-bot-api` service.
 
 ```bash
 docker-compose up -d
 ```
 
-### Python (Not Recommended)
+The compose file pulls pre-built images from GitHub Container Registry:
 
-Local Telegram API sever and video download function is not supported in this way. If you do really need these functions, you can run the telegram api server and [the file export server](https://github.com/aturret/FastFileExporter) manually.
+- `ghcr.io/aturret/fastfetchbot-api:latest`
+- `ghcr.io/aturret/fastfetchbot-telegram-bot:latest`
 
-We use [Poetry](https://python-poetry.org/) as the package manager for this project. You can install it by the following command.
+To build locally instead, uncomment the `build:` blocks and comment out the `image:` lines in `docker-compose.yml`.
 
-```bash
-pip install poetry
-```
+### Local Development
 
-Then, install the dependencies.
+Requires Python 3.12 and [uv](https://docs.astral.sh/uv/).
 
 ```bash
-poetry install
-```
+# Install all dependencies (including dev)
+uv sync
 
-Finally, run the server.
+# Run the API server
+cd apps/api
+uv run gunicorn -k uvicorn.workers.UvicornWorker src.main:app --preload
 
-```bash
-poetry run gunicorn -k uvicorn.workers.UvicornWorker app.main:app --preload
+# Run the Telegram Bot (in a separate terminal)
+cd apps/telegram-bot
+uv run python -m core.main
 ```
 
-## Environment Variables
+### Telegram Bot Modes
+
+The bot supports two modes, controlled by the `TELEGRAM_BOT_MODE` environment variable:
 
-Note: Many of the services requires cookies to fetch content. You can get your cookies by browser extension [Get cookies.txt LOCALLY](https://chrome.google.com/webstore/detail/get-cookiestxt-locally/cclelndahbckbenkjhflpdbgdldlbecc) and set the cookies as environment variables.
+| Mode | Value | Use Case |
+|------|-------|----------|
+| **Long Polling** | `polling` (default) | Local development, simple deployments without a reverse proxy |
+| **Webhook** | `webhook` | Production with a public HTTPS URL |
 
+In both modes, the bot runs an HTTP server on port 10451 for the `/send_message` callback endpoint (used by Inoreader integration) and `/health`.
 
-### Required Variables
+## Development
 
-- `BASE_URL`: The base url of the server. example: `example.com`
-- `TELEGRAM_BOT_TOKEN`: The token of the telegram bot.
-- `TELEGRAM_CHAT_ID`: The chat id of the telegram bot.
+### Commands
 
-### Optional Variables
+```bash
+uv sync                    # Install all dependencies
+uv run pytest              # Run tests
+uv run pytest -v           # Run tests with verbose output
+uv run black .             # Format code
+```
 
-#### FastAPI
+### Adding a New Platform Scraper
 
-- `PORT`: Default: `10450`
-- `API_KEY`: The api key for the FastAPI server. It would be generated automatically if not set.
+1. Create a new scraper module in `apps/api/src/services/scrapers/<platform>/`
+2. Implement the scraper class following existing patterns
+3. Add a platform-specific router in `apps/api/src/routers/`
+4. Register the scraper in `ScraperManager`
+5. Add configuration variables in `apps/api/src/config.py`
+6. Create tests in `tests/cases/`
 
-#### Telegram
+### Docker Build
 
-- `TELEBOT_API_SERVER_HOST`: The host of the telegram bot api server. Default: `telegram-bot-api`
-- `TELEBOT_API_SERVER_PORT`: The port of the telegram bot api server. Default: `8081`
-- `TELEGRAM_CHANNEL_ID`: The channel id of the telegram bot. Default: `None`
-- `TELEGRAM_CHANNEL_ADMIN_LIST`: The id list of the users who can send message to targeted telegram channel, divided by `,`. You cannot send message to the channel if you are not in the list. Default: `None`
+```bash
+# Build both services locally
+docker-compose build
 
-#### Twitter
+# Or build individually
+docker build -f apps/api/Dockerfile -t fastfetchbot-api .
+docker build -f apps/telegram-bot/Dockerfile -t fastfetchbot-telegram-bot .
+```
 
-Must set cookies variables if you want to fetch twitter content.
+> **Note:** Both Dockerfiles use the repository root as the build context (`.`) because they need access to `pyproject.toml`, `uv.lock`, and `packages/shared/`.
 
-- `TWITTER_CT0`: The ct0 cookie of twitter. Default: `None`
-- `TWITTER_AUTH_TOKEN`: The auth token of twitter. Default: `None`
+## Environment Variables
 
-#### Reddit
+Many scrapers require authentication cookies. You can extract cookies using the browser extension [Get cookies.txt LOCALLY](https://chrome.google.com/webstore/detail/get-cookiestxt-locally/cclelndahbckbenkjhflpdbgdldlbecc).
 
-We use `read_only` mode of `praw` to fetch reddit content. We still need to set the `client_id` , `client_secret` , `username` and `password` of your reddit api account.
+See `template.env` for a complete reference with comments.
 
-- `REDDIT_CLIENT_ID`: The client id of reddit. Default: `None`
-- `REDDIT_CLIENT_SECRET`: The client secret of reddit. Default: `None`
-- `REDDIT_USERNAME`: The username of reddit. Default: `None`
-- `REDDIT_PASSWORD`: The password of reddit. Default: `None`
+### Required
 
-#### Weibo
+| Variable | Description |
+|----------|-------------|
+| `BASE_URL` | Public domain of the server (e.g. `example.com`). Used for webhook URL construction. |
+| `TELEGRAM_BOT_TOKEN` | Bot token from [@BotFather](https://t.me/BotFather) |
+| `TELEGRAM_CHAT_ID` | Default chat ID for the bot |
 
-- `WEIBO_COOKIES`: The cookie of weibo. For some unknown reasons, some weibo posts may be not accessible if you don't are not logged in. Just copy the cookie from your browser and set it. Default: `None`
+### Service Communication (Docker)
 
-#### Xiaohongshu
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `API_SERVER_URL` | `http://localhost:10450` | URL the Telegram Bot uses to call the API server. Set to `http://api:10450` in Docker. |
+| `TELEGRAM_BOT_CALLBACK_URL` | `http://localhost:10451` | URL the API server uses to call the Telegram Bot. Set to `http://telegram-bot:10451` in Docker. |
+| `TELEGRAM_BOT_MODE` | `polling` | `polling` or `webhook` |
 
-- `XIAOHONGSHU_A1`: The a1 cookie of xiaohongshu. Default: `None`
-- `XIAOHONGSHU_WEBID`: The webid cookie of xiaohongshu. Default: `None`
-- `XIAOHONGSHU_WEBSESSION`: The websession cookie of xiaohongshu. Default: `None`
-#### OpenAI
+### Optional
 
-You can set the api key of OpenAI to use the transcription function.
+#### API Server
 
-- `OPENAI_API_KEY`: The api key of OpenAI. Default: `None`
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `PORT` | `10450` | API server port |
+| `API_KEY` | auto-generated | API key for authentication |
 
-#### Amazon S3 Picture Storage
+#### Telegram
 
-- `AWS_ACCESS_KEY_ID`: The access key id of Amazon S3. Default: `None`
-- `AWS_SECRET_ACCESS_KEY`: The secret access key of Amazon S3. Default: `None`
-- `AWS_S3_BUCKET_NAME`: The bucket name of Amazon S3. Default: `None`
-- `AWS_S3_REGION_NAME`: The region name of Amazon S3. Default: `None`
-- `AWS_DOMAIN_HOST`: The domain bound to the bucket. The picture upload function would generate images url by bucket name if customized host not set. Default: `None`
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `TELEBOT_API_SERVER_HOST` | `None` | Local Telegram Bot API server host |
+| `TELEBOT_API_SERVER_PORT` | `None` | Local Telegram Bot API server port |
+| `TELEGRAM_CHANNEL_ID` | `None` | Channel ID(s) for the bot, comma-separated |
+| `TELEGRAM_CHANNEL_ADMIN_LIST` | `None` | User IDs allowed to post to the channel, comma-separated |
+
+#### Platform Cookies & Credentials
+
+| Platform | Variables |
+|----------|-----------|
+| Twitter | `TWITTER_CT0`, `TWITTER_AUTH_TOKEN` |
+| Reddit | `REDDIT_CLIENT_ID`, `REDDIT_CLIENT_SECRET`, `REDDIT_USERNAME`, `REDDIT_PASSWORD` |
+| Weibo | `WEIBO_COOKIES` |
+| Xiaohongshu | `XIAOHONGSHU_A1`, `XIAOHONGSHU_WEBID`, `XIAOHONGSHU_WEBSESSION` |
+| Instagram | `X_RAPIDAPI_KEY` |
+| Zhihu | Store cookies in `conf/zhihu_cookies.json` |
+
+#### Cloud Services
+
+| Variable | Description |
+|----------|-------------|
+| `OPENAI_API_KEY` | OpenAI API key for audio transcription |
+| `AWS_ACCESS_KEY_ID` | Amazon S3 access key |
+| `AWS_SECRET_ACCESS_KEY` | Amazon S3 secret key |
+| `AWS_S3_BUCKET_NAME` | S3 bucket name |
+| `AWS_S3_REGION_NAME` | S3 region |
+| `AWS_DOMAIN_HOST` | Custom domain bound to the S3 bucket |
+
+#### General Webpage Scraping
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `GENERAL_SCRAPING_ON` | `false` | Enable scraping for unrecognized URLs |
+| `GENERAL_SCRAPING_API` | `FIRECRAWL` | Backend: `FIRECRAWL` or `ZYTE` |
+| `FIRECRAWL_API_URL` | | Firecrawl API server URL |
+| `FIRECRAWL_API_KEY` | | Firecrawl API key |
+| `ZYTE_API_KEY` | | Zyte API key |
 
 ## Supported Content Types
 
-### Social Media Content
+### Social Media
 
 - [x] Twitter
 - [x] Bluesky (Beta, only supports part of posts)
 - [x] Instagram
-- [ ] Threads 
+- [ ] Threads
 - [x] Reddit (Beta, only supports part of posts)
 - [ ] Quora
 - [x] Weibo
@@ -132,11 +195,18 @@ You can set the api key of OpenAI to use the transcription function.
 - [x] Douban
 - [ ] Xiaohongshu
 
-### Video Content
+### Video
 
-- [x] Youtube
+- [x] YouTube
 - [x] Bilibili
 
+## CI/CD
+
+The GitHub Actions pipeline (`.github/workflows/ci.yml`) automatically builds and pushes both microservice images to GitHub Container Registry on every push to `main`:
+
+- `ghcr.io/aturret/fastfetchbot-api:latest`
+- `ghcr.io/aturret/fastfetchbot-telegram-bot:latest`
+
 ## Acknowledgements
 
 The HTML to Telegra.ph converter function is based on [html-telegraph-poster](https://github.com/mercuree/html-telegraph-poster). I separated it from this project as an independent Python package: [html-telegraph-poster-v2](https://github.com/aturret/html-telegraph-poster-v2).