feat(pdf-reader): add new skill for PDF text extraction fallback #61

Open
divitkashyap wants to merge 2 commits into MiniMax-AI:main from divitkashyap:feat/pdf-reader-clean

divitkashyap (Contributor) commented Apr 4, 2026

Submitted by: https://github.com/divitkashyap

What

Adds pdf-reader, a skill that falls back to command-line tools (pdftotext from poppler-utils) to extract PDF text, with optional tool installation gated on user confirmation.

Why

When a user shares a PDF or asks to read or extract text from it, and the agent lacks native PDF capability, this skill provides a complete fallback workflow:

  1. Detect PDF file in user's message
  2. Check for available tools (pdftotext → pdfplumber → pymupdf)
  3. If no tool found, ask user permission to install
  4. Extract PDF text to temp file
  5. Continue with original user task
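
Step 2 of the workflow above can be sketched roughly as follows. This is a minimal illustration, not the code shipped in SKILL.md: the function name `find_pdf_tool` and its injectable `which`/`find_spec` parameters are assumptions made for the example. It relies on PyMuPDF being importable as `pymupdf` in recent releases and as `fitz` in older ones.

```python
import importlib.util
import shutil


def find_pdf_tool(which=shutil.which, find_spec=importlib.util.find_spec):
    """Return the first available extractor in priority order, or None.

    Hypothetical helper: the priority order comes from the PR description,
    the name and signature do not.
    """
    # 1. pdftotext is a system binary, so look it up on PATH.
    if which("pdftotext"):
        return "pdftotext"
    # 2. pdfplumber is a Python package, so check whether it can be imported.
    if find_spec("pdfplumber") is not None:
        return "pdfplumber"
    # 3. PyMuPDF: check both the newer and the older import name.
    if find_spec("pymupdf") is not None or find_spec("fitz") is not None:
        return "pymupdf"
    # Nothing found: the skill would now ask the user for permission to install.
    return None
```

Calling `find_pdf_tool()` with no arguments checks the real environment; the parameters exist only so the lookup can be exercised without installing anything.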

Complementary to minimax-pdf-read (PR #51)

This skill differs from minimax-pdf-read:

  • minimax-pdf-read: invoked actively, when the user explicitly asks to extract text from a PDF
  • pdf-reader: a fallback, used when the agent needs to process a PDF but lacks the capability

Both can coexist because they serve different use cases.

Tool Priority

  1. pdftotext (poppler-utils) — Preferred, fastest, system-level
  2. pdfplumber (Python) — Fallback if poppler not available
  3. pymupdf (Python) — Alternative Python fallback

Platform Support

  • macOS: Homebrew (brew install poppler) or pip
  • Linux: apt-get/dnf install poppler-utils or pip
  • Windows: winget/chocolatey or pip
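
One way to turn the platform list above into a suggested command is sketched below. The helper name `suggest_install_command` is hypothetical. The PR does not name the winget or chocolatey package, so this sketch falls back to pip on Windows rather than guessing one. The command is only returned for display; running it stays behind the user's confirmation, and system package managers may need elevated privileges.

```python
import platform
import shutil
import sys


def suggest_install_command(system=None, which=shutil.which):
    """Return an install command (as an argv list) to show to the user.

    Hypothetical helper: it only builds the command and never runs it.
    """
    system = system or platform.system()
    if system == "Darwin" and which("brew"):
        return ["brew", "install", "poppler"]
    if system == "Linux":
        if which("apt-get"):
            return ["apt-get", "install", "poppler-utils"]
        if which("dnf"):
            return ["dnf", "install", "poppler-utils"]
    # Assumption: with no known system package manager (including Windows,
    # where the PR gives no package name), suggest the pip route instead.
    return [sys.executable, "-m", "pip", "install", "pdfplumber"]
```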

Validation

All 15 skills pass: python .claude/skills/pr-review/scripts/validate_skills.py ✅

## What
Added `pdf-reader`, a skill that detects when an agent cannot read PDFs and extracts the text with command-line tools, with optional tool installation gated on user confirmation.

## Why
Many AI agents lack native PDF reading capability. When they encounter a PDF, they either:
- Fail to help the user
- Give generic responses about not being able to access PDF content

This skill intercepts that situation and provides a complete fallback workflow using standard tools (pdftotext, pdfplumber, pymupdf).

## How It Works
1. **Detection**: Monitors for agent statements like 'I cannot read PDFs', 'I don't have the ability to read PDFs', etc.
2. **Tool Detection**: Checks for available tools in priority order: pdftotext → pdfplumber → pymupdf
3. **Installation**: If no tool found, asks user permission with platform-specific install commands
4. **Extraction**: Extracts PDF text to /tmp/pdf_extracted.txt
5. **Continuation**: Reads extracted text and proceeds with original user task
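
When only a Python library is available, steps 4 and 5 might look roughly like the sketch below. The helper names `extract_pages` and `save_extracted` are assumptions for illustration; the library calls used are `pdfplumber.open` with `page.extract_text()` and PyMuPDF's `open` with `page.get_text()`. Pages are joined with a form feed, which is also how pdftotext separates pages by default.

```python
from pathlib import Path


def extract_pages(pdf_path):
    """Return per-page text using whichever Python library is importable."""
    try:
        import pdfplumber
    except ImportError:
        pdfplumber = None
    if pdfplumber is not None:
        with pdfplumber.open(pdf_path) as pdf:
            # extract_text() can return None for pages without a text layer.
            return [page.extract_text() or "" for page in pdf.pages]
    try:
        import pymupdf  # newer PyMuPDF releases
    except ImportError:
        import fitz as pymupdf  # older releases; raises ImportError if absent
    with pymupdf.open(pdf_path) as doc:
        return [page.get_text() for page in doc]


def save_extracted(pages, out_path="/tmp/pdf_extracted.txt"):
    """Write the pages to the temp file named in step 4 and return its path."""
    Path(out_path).write_text("\f".join(pages), encoding="utf-8")
    return out_path
```

The agent would then read the file returned by `save_extracted(extract_pages("report.pdf"))` and continue with the original task; `report.pdf` is a placeholder name.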

## Tool Priority
1. **pdftotext** (poppler-utils) — Preferred, fastest, system-level tool
2. **pdfplumber** (Python) — Fallback if poppler not available
3. **pymupdf** (Python) — Alternative Python fallback

## Platform Support
- **macOS**: Homebrew (brew install poppler) or pip
- **Linux (Ubuntu/Debian)**: apt-get install poppler-utils or pip
- **Linux (Fedora/RHEL)**: dnf install poppler-utils or pip
- **Windows**: winget/chocolatey or pip

## Key Features
- Automatic detection of agent PDF limitation
- Multi-tool fallback strategy
- User confirmation before installation
- Platform-specific installation commands
- Layout preservation option (-layout flag)
- Page range extraction support (-f, -l flags)
- Error handling for encrypted/protected PDFs
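
The layout and page-range options listed above map onto pdftotext's `-layout`, `-f`, and `-l` flags. A minimal sketch of assembling and running the command follows; the helper names `build_pdftotext_cmd` and `run_pdftotext` are hypothetical, and the error handling is an assumption about how the skill might surface failures such as encrypted files, not a description of SKILL.md.

```python
import subprocess


def build_pdftotext_cmd(pdf_path, out_path, layout=False,
                        first_page=None, last_page=None):
    """Build the pdftotext argv list from the optional features."""
    cmd = ["pdftotext"]
    if layout:
        cmd.append("-layout")            # keep the physical page layout
    if first_page is not None:
        cmd += ["-f", str(first_page)]   # first page to convert
    if last_page is not None:
        cmd += ["-l", str(last_page)]    # last page to convert
    return cmd + [pdf_path, out_path]


def run_pdftotext(cmd):
    """Run the command; return (ok, message) instead of raising."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # A non-zero exit covers unreadable, encrypted, or protected files.
        return False, result.stderr.strip()
    return True, ""
```

For example, `build_pdftotext_cmd("in.pdf", "/tmp/pdf_extracted.txt", layout=True, first_page=1, last_page=3)` extracts the first three pages with layout preserved.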

## Example Triggers
- 'I cannot read PDFs'
- 'I don't have the ability to read PDFs'
- 'I can't access PDF content'
- 'PDF reading is not supported'
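
Matching these trigger phrases could be as simple as a case-insensitive pattern search. The sketch below is illustrative only: the PR does not say how detection is implemented, and the names `TRIGGER_PATTERNS` and `is_pdf_limitation` are assumptions.

```python
import re

# Patterns derived from the example triggers; the list is not exhaustive.
TRIGGER_PATTERNS = [
    r"cannot read pdfs?",
    r"don't have the ability to read pdfs?",
    r"can't access pdf content",
    r"pdf reading is not supported",
]
_TRIGGER_RE = re.compile("|".join(TRIGGER_PATTERNS), re.IGNORECASE)


def is_pdf_limitation(message):
    """Return True if the message contains one of the trigger phrases."""
    # Normalize typographic apostrophes so "can’t" matches "can't".
    normalized = message.replace("\u2019", "'")
    return bool(_TRIGGER_RE.search(normalized))
```

Substring matching like this will miss paraphrases, so a real skill would likely need a broader set of patterns.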

## Files
- skills/pdf-reader/SKILL.md — Complete skill with workflow
- README.md — Updated with new skill entry

divitkashyap changed the title from "Feat/pdf reader clean" to "feat(pdf-reader): add new skill for PDF text extraction fallback" on Apr 4, 2026
divitkashyap force-pushed the feat/pdf-reader-clean branch from b50e700 to e827850 on April 6, 2026 00:06
divitkashyap force-pushed the feat/pdf-reader-clean branch from e827850 to bc1b338 on April 6, 2026 11:29