vtvz/medpack

MedPack

📢 Subscribe to the author's Telegram channel for updates and more projects: @vtvz_dev

A powerful Rust-based tool for processing and organizing medical documents from Telegram chat exports into structured PDF documents with automatic OCR, metadata extraction, and comprehensive table of contents generation.

🎯 Overview

MedPack transforms structured Telegram chat exports containing medical records into beautifully organized PDF documents. It intelligently processes images, PDFs, and text messages, groups them by person, and creates professional medical document collections with proper pagination, OCR processing, and detailed table of contents.

Key Features

  • 📱 Multi-format Processing: Handles images (PNG, JPG), PDFs, and text messages from Telegram exports
  • 🔍 OCR Integration: Automatic OCR processing for images using ocrmypdf with Russian and English language support
  • 📋 Metadata Extraction: Parses YAML metadata blocks from messages to extract structured medical record information
  • 👥 Smart Organization: Groups messages by person and creates separate PDF documents for each individual
  • 📚 Table of Contents: Generates detailed TOC with page numbers, dates, tags, and clickable Telegram message links
  • ⚡ Parallel Processing: Multi-threaded processing with real-time progress bars for efficient handling of large datasets
  • 🏷️ Document Labeling: Adds professional headers, footers, and page numbers to all documents
  • 🛠️ Flexible Configuration: Optional OCR processing, temporary file preservation for debugging
  • 🔗 Telegram Integration: Preserves links to original messages for easy reference

📺 Live Example

Want to see MedPack in action? Check out our live example Telegram group:

🔗 MedPack Example Group

This group contains:

  • 📱 Real medical record messages with proper YAML metadata formatting
  • 🖼️ Sample images and PDFs showing the input format MedPack expects
  • 📄 Processing results - the final generated PDF documents
  • 💡 Best practices for structuring your medical records in Telegram

The group demonstrates exactly how to format your Telegram messages for optimal MedPack processing, including proper YAML metadata blocks, image attachments, and text formatting. You can use this as a reference when preparing your own medical record exports.

🚀 Quick Start

The easiest way to run MedPack is using Docker. All prerequisites are preinstalled in the image.

docker run --rm -v "$(pwd):$(pwd)" -w "$(pwd)" -u "$(id -u):$(id -g)" -it --pull always ghcr.io/vtvz/medpack:latest

Building from Source

Prerequisites

Before using MedPack, ensure you have all the required external tools installed. The complete list of required tools can be found in the src/command.rs file.

Building MedPack

  1. Clone the repository:
git clone <repository-url>
cd medpack
  2. Build the project:
cargo build --release

The binary will be available at target/release/medpack.

Installing MedPack

Alternatively, you can install MedPack directly to your system using Cargo:

cargo install --path .

This will install the medpack binary to your Cargo bin directory (usually ~/.cargo/bin/), making it available system-wide.

📖 Usage

Basic Usage

medpack [OPTIONS] [SOURCES...]

Command Line Options

For a complete list of available options and their descriptions, run:

medpack --help

Examples

Process current directory:

medpack

Process specific directories without OCR:

medpack --no-ocr /path/to/export1 /path/to/export2

Debug mode with temporary file preservation:

medpack --preserve-tmp --no-ocr ./telegram_export

Process multiple exports simultaneously:

medpack ~/Downloads/ChatExport_2023 ~/Downloads/ChatExport_2024

💡 Tip: When given multiple exports, MedPack merges them. In the future you can export only the new messages and process them alongside your existing exports, instead of re-exporting the entire chat history.

📝 Note: When merged exports contain the same message more than once (including edited versions), MedPack automatically uses the latest edited version. Corrections or updates made to medical records in Telegram are therefore reflected in the final PDF output.
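
The merge rule can be pictured with a minimal sketch. It is not MedPack's actual code: the `Message` struct and its `id` and `edited_at` fields are hypothetical stand-ins for whatever the export provides, and the only rule modeled is "for each message id, keep the copy with the most recent edit".

```rust
use std::collections::HashMap;

/// Hypothetical, simplified view of an exported message (not MedPack's real type).
#[derive(Debug, Clone, PartialEq)]
struct Message {
    id: u64,
    /// Unix time of the last edit; 0 if the message was never edited (assumption).
    edited_at: u64,
    text: String,
}

/// Merge several exports, keeping the most recently edited copy of each message id.
fn merge_exports(exports: Vec<Vec<Message>>) -> Vec<Message> {
    let mut latest: HashMap<u64, Message> = HashMap::new();
    for message in exports.into_iter().flatten() {
        // Store the message if it is unseen or strictly newer than the stored copy.
        let is_newer = latest
            .get(&message.id)
            .map_or(true, |existing| message.edited_at > existing.edited_at);
        if is_newer {
            latest.insert(message.id, message);
        }
    }
    // Return the result in a stable order, sorted by message id.
    let mut merged: Vec<Message> = latest.into_values().collect();
    merged.sort_by_key(|m| m.id);
    merged
}

fn main() {
    let old_export = vec![Message { id: 1, edited_at: 0, text: "draft".into() }];
    let new_export = vec![
        Message { id: 1, edited_at: 1_700_000_000, text: "corrected".into() },
        Message { id: 2, edited_at: 0, text: "new record".into() },
    ];
    for m in merge_exports(vec![old_export, new_export]) {
        println!("{}: {}", m.id, m.text);
    }
}
```

Because only the newest copy of each id survives, the order in which the exports are passed does not matter in this sketch.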

📁 Input Format

Telegram Export Structure

MedPack expects Telegram chat exports in JSON format with the following structure:

telegram_export/
├── result.json         # Main export file with message data
├── photos/             # Directory containing image files
│   ├── photo_1.jpg
│   └── photo_2.png
└── files/              # Directory containing PDF attachments
    └── document.pdf

Message Types Processed

⚠️ Important: Only messages containing YAML metadata blocks are processed. All other messages, images, and files without YAML blocks are ignored.

  1. πŸ“ Messages with YAML metadata blocks - Define medical records with structured information
  2. πŸ“· Image messages - Photos in PNG or JPEG format (both compressed regular photos and uncompressed file attachments) that can be processed with OCR
  3. πŸ“„ PDF attachments - Direct PDF files from messages
  4. πŸ’¬ Text messages - Converted to PDF format

YAML Metadata Format

Messages MUST contain YAML blocks with medical record metadata:

date: 2023.12.22
person: John Doe
tags:
  - cardiology
  - checkup
  - ECG
place: City Hospital
doctor: Dr. Smith
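
To make the shape of this metadata concrete, here is a minimal sketch that reads the block above into a struct. It is an illustration only: `RecordMeta` and `parse_meta` are hypothetical names, and the hand-written parser handles just the flat `key: value` and `- item` subset shown in the example. A real implementation would more likely rely on a YAML library.

```rust
/// Hypothetical record metadata mirroring the fields in the example block.
#[derive(Debug, Default, PartialEq)]
struct RecordMeta {
    date: String,
    person: String,
    tags: Vec<String>,
    place: Option<String>,
    doctor: Option<String>,
}

/// Parse the flat `key: value` / `- item` subset of YAML used in the example.
/// This is not a general YAML parser.
fn parse_meta(block: &str) -> Option<RecordMeta> {
    let mut meta = RecordMeta::default();
    let mut in_tags = false;
    for line in block.lines() {
        let trimmed = line.trim();
        if trimmed.is_empty() {
            continue;
        }
        // List items are only meaningful directly under `tags:`.
        if let Some(item) = trimmed.strip_prefix("- ") {
            if in_tags {
                meta.tags.push(item.trim().to_string());
            }
            continue;
        }
        let (key, value) = trimmed.split_once(':')?;
        let value = value.trim().to_string();
        in_tags = false;
        match key.trim() {
            "date" => meta.date = value,
            "person" => meta.person = value,
            "tags" => in_tags = true,
            "place" => meta.place = Some(value),
            "doctor" => meta.doctor = Some(value),
            _ => {} // unknown keys are ignored in this sketch
        }
    }
    // `date`, `person`, and `tags` are marked as required in this README.
    if meta.date.is_empty() || meta.person.is_empty() || meta.tags.is_empty() {
        return None;
    }
    Some(meta)
}

fn main() {
    let block = "date: 2023.12.22\nperson: John Doe\ntags:\n  - cardiology\n  - checkup\n  - ECG\nplace: City Hospital\ndoctor: Dr. Smith";
    println!("{:?}", parse_meta(block));
}
```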

📝 Text Record Formatting

For text-only records (messages without images or PDF files), you can use special code blocks to enhance the content:

HTML Code Blocks - Insert raw HTML directly into the generated PDF

CSV Code Blocks - Create tables from CSV data, where the first row is treated as the header

Hidden Code Blocks - Add personal notes that won't appear in the final PDF

Telegram Formatting - All Telegram message formatting is preserved

Example Text Record:
```yaml
date: 2023.12.22
person: John Doe
tags:
  - consultation
  - notes
```

Patient reported feeling better after treatment.

```html
<div class="alert alert-info">
  <strong>Important:</strong> Patient has allergies to penicillin and sulfa drugs.
</div>
```

```csv
Medication,Dosage,Frequency,Duration
Aspirin,100mg,Daily,30 days
Lisinopril,10mg,Daily,Ongoing
Metformin,500mg,Twice daily,90 days
```

Follow-up appointment scheduled for next month.

```hidden
Remember to follow up on blood test results next week.
Patient seemed anxious - consider referral to counselor.
```
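
As an illustration of the CSV rule (first row becomes the header), the sketch below turns comma-separated text into an HTML table. It is an assumption about how such a conversion could look, not MedPack's implementation: `csv_to_html_table` and `escape_html` are hypothetical helpers, and quoted fields containing commas are not handled.

```rust
/// Escape the characters that are special in HTML text.
fn escape_html(text: &str) -> String {
    text.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
}

/// Render simple comma-separated text as an HTML table.
/// The first non-empty row uses <th> cells, every later row uses <td> cells.
/// Quoted fields with embedded commas are not supported in this sketch.
fn csv_to_html_table(csv: &str) -> String {
    let mut html = String::from("<table>\n");
    let rows = csv.lines().filter(|line| !line.trim().is_empty());
    for (index, line) in rows.enumerate() {
        let cell_tag = if index == 0 { "th" } else { "td" };
        html.push_str("  <tr>");
        for cell in line.split(',') {
            html.push_str(&format!("<{0}>{1}</{0}>", cell_tag, escape_html(cell.trim())));
        }
        html.push_str("</tr>\n");
    }
    html.push_str("</table>");
    html
}

fn main() {
    let csv = "Medication,Dosage,Frequency,Duration\nAspirin,100mg,Daily,30 days\nLisinopril,10mg,Daily,Ongoing";
    println!("{}", csv_to_html_table(csv));
}
```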

⚠️ Important YAML Block Requirements

  • πŸ“ Position: The YAML block must be at the very beginning of the message text
  • πŸ“· Multiple Images: If a medical record consists of multiple images, the YAML block should be placed under the first image in the sequence
  • πŸ–ΌοΈ Image Format: Images must be in PNG or JPEG format (both compressed regular photos and uncompressed file attachments) for proper OCR processing
  • πŸ’» Formatting: The YAML block must be formatted as code within the Telegram message, not as plain text
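
The position and formatting rules above can be sketched as a simple check. The `Fragment` enum is a hypothetical, simplified model of a formatted message (plain text versus code), not the structure of the real Telegram export, and `leading_yaml_block` is an illustrative name.

```rust
/// Hypothetical, simplified model of one formatted piece of a message.
enum Fragment {
    /// Ordinary text.
    Plain(String),
    /// Text formatted as a code block in Telegram.
    Code(String),
}

/// Return the text of the leading code block, or `None` when the message
/// is empty or does not start with a code block.
fn leading_yaml_block(fragments: &[Fragment]) -> Option<&str> {
    match fragments.first()? {
        Fragment::Code(text) => Some(text.as_str()),
        Fragment::Plain(_) => None,
    }
}

fn main() {
    let message = vec![
        Fragment::Code("date: 2023.12.22\nperson: John Doe".into()),
        Fragment::Plain("Patient reported feeling better after treatment.".into()),
    ];
    match leading_yaml_block(&message) {
        Some(block) => println!("metadata block found:\n{}", block),
        None => println!("message would be ignored"),
    }
}
```

In this model, a message whose metadata is typed as plain text, or placed after other text, yields `None` and would be skipped.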

Supported YAML Fields

| Field | Type | Description | Required |
|-------|------|-------------|----------|
| date | String | Date of the medical record (YYYY.MM.DD) | ✅ |
| person | String | Name of the person the record belongs to | ✅ |
| tags | Array | List of tags/categories for the record | ✅ |
| place | String | Medical facility or location | ❌ |
| doctor | String | Doctor's name | ❌ |
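
A small sketch of how a `YYYY.MM.DD` value could be validated and made sortable. `parse_record_date` is a hypothetical helper, and the range checks are deliberately basic (for example, they do not reject 2023.02.31).

```rust
/// Parse a `YYYY.MM.DD` string into (year, month, day) with basic range checks.
/// Returns `None` for any other shape, such as `2023-12-22`.
fn parse_record_date(text: &str) -> Option<(u16, u8, u8)> {
    let mut parts = text.trim().split('.');
    let year: u16 = parts.next()?.parse().ok()?;
    let month: u8 = parts.next()?.parse().ok()?;
    let day: u8 = parts.next()?.parse().ok()?;
    // Reject trailing components and out-of-range months or days.
    if parts.next().is_some() || !(1..=12).contains(&month) || !(1..=31).contains(&day) {
        return None;
    }
    Some((year, month, day))
}

fn main() {
    // Tuples compare field by field, so sorting them gives chronological order.
    let mut dates: Vec<(u16, u8, u8)> = ["2023.12.22", "2023.02.10", "not a date"]
        .iter()
        .filter_map(|text| parse_record_date(text))
        .collect();
    dates.sort();
    println!("{:?}", dates);
}
```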

🏷️ HTML Tags Support

Tags now support HTML formatting for enhanced visual presentation in the generated PDFs. This is particularly useful for highlighting important issues or categorizing records with visual emphasis.

Examples:

tags:
  - cardiology
  - <b>urgent</b>
  - <i>follow-up required</i>
  - ECG
  - <b style="color: red;">critical</b>

📤 Output

Generated Files

For each person found in the chat export, MedPack generates:

  1. PersonName.pdf - Complete medical document collection
  2. Table of Contents - At the beginning of each PDF containing:
    • Record dates and tags
    • Page numbers with proper pagination
    • Clickable links to original Telegram messages
    • Doctor and location information
    • Professional formatting with Bootstrap CSS
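
The per-person grouping and TOC page numbers can be sketched as follows. This is a simplified model under stated assumptions, not MedPack's code: `Record`, `TocEntry`, and `build_tocs` are hypothetical, records are assumed to be ordered by date, and the TOC itself is assumed to occupy a known number of pages at the front of each PDF.

```rust
use std::collections::BTreeMap;

/// Hypothetical summary of one processed record.
struct Record {
    person: String,
    date: String, // YYYY.MM.DD, which sorts correctly as plain text
    pages: u32,   // number of PDF pages the record occupies
}

/// One line of a table of contents: record date and its first page.
#[derive(Debug, PartialEq)]
struct TocEntry {
    date: String,
    first_page: u32,
}

/// Group records by person and compute the first page of each record,
/// assuming the TOC takes `toc_pages` pages at the start of every PDF.
fn build_tocs(records: &[Record], toc_pages: u32) -> BTreeMap<String, Vec<TocEntry>> {
    let mut by_person: BTreeMap<String, Vec<&Record>> = BTreeMap::new();
    for record in records {
        by_person.entry(record.person.clone()).or_default().push(record);
    }
    by_person
        .into_iter()
        .map(|(person, mut person_records)| {
            person_records.sort_by(|a, b| a.date.cmp(&b.date));
            let mut next_page = toc_pages + 1;
            let entries: Vec<TocEntry> = person_records
                .iter()
                .map(|record| {
                    let entry = TocEntry { date: record.date.clone(), first_page: next_page };
                    next_page += record.pages;
                    entry
                })
                .collect();
            (person, entries)
        })
        .collect()
}

fn main() {
    let records = vec![
        Record { person: "John Doe".into(), date: "2023.12.22".into(), pages: 3 },
        Record { person: "John Doe".into(), date: "2023.02.10".into(), pages: 2 },
    ];
    for (person, entries) in build_tocs(&records, 1) {
        println!("{}.pdf", person);
        for entry in entries {
            println!("  {} ... page {}", entry.date, entry.first_page);
        }
    }
}
```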

Document Features

  • 📄 Professional Layout: Clean, medical-grade document formatting
  • 🔢 Page Numbers: Consistent pagination throughout the document
  • 🏷️ Headers & Footers: Record metadata displayed in document headers
  • 🔗 Telegram Links: Direct links to original messages for verification
  • 📊 Progress Tracking: Real-time progress bars during processing
  • 🎨 Responsive Design: Bootstrap-based HTML rendering for PDFs

🐛 Troubleshooting

Common Issues

Missing External Tools

# Error: command not found
medpack: error: `img2pdf` not found in PATH

Solution: Install missing prerequisites using your package manager.
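
To see in advance which tools are missing, a check along these lines can help. It is a standalone sketch using only the Rust standard library, not part of MedPack: `find_in_path` and `missing_tools` are hypothetical helpers, only the two tool names mentioned in this README are checked (the full list is in src/command.rs), and the executable bit and Windows file extensions are not considered.

```rust
use std::env;
use std::path::PathBuf;

/// Return the first `PATH` entry that contains a file named `tool`, if any.
/// Note: this sketch does not verify that the file is executable.
fn find_in_path(tool: &str) -> Option<PathBuf> {
    let path_var = env::var_os("PATH")?;
    env::split_paths(&path_var)
        .map(|dir| dir.join(tool))
        .find(|candidate| candidate.is_file())
}

/// Return the names of the tools that could not be found in `PATH`.
fn missing_tools(tools: &[&str]) -> Vec<String> {
    tools
        .iter()
        .filter(|tool| find_in_path(tool).is_none())
        .map(|tool| tool.to_string())
        .collect()
}

fn main() {
    // Tool names taken from this README; see src/command.rs for the full list.
    let missing = missing_tools(&["img2pdf", "ocrmypdf"]);
    if missing.is_empty() {
        println!("all checked tools were found");
    }
    for tool in missing {
        eprintln!("`{}` not found in PATH", tool);
    }
}
```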

OCR Processing Slow

# Use --no-ocr flag to completely disable OCR processing
medpack --no-ocr

Note: The --no-ocr flag completely disables OCR processing for images, which significantly speeds up processing but means that text within images will not be extracted or searchable in the final PDF.

Debug Mode

Enable debug mode to inspect temporary files:

medpack --preserve-tmp

This will output paths to temporary directories:

tmp folders: /tmp/medpack_html_xyz /tmp/medpack_img_xyz /tmp/medpack_label_xyz

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.


📧 Support: For issues and questions, please use the GitHub issue tracker.

🔄 Updates: Check releases for the latest features and bug fixes.

💬 Personal Support: If you have any questions or need help, feel free to reach out to me personally on Telegram: @vtvz_me
