Skip to content

Add PaddleOCR-VL multimodal model support with Docker dependencies#2

Draft
Copilot wants to merge 13 commits intomasterfrom
copilot/add-paddleocr-vl-support
Draft

Add PaddleOCR-VL multimodal model support with Docker dependencies#2
Copilot wants to merge 13 commits intomasterfrom
copilot/add-paddleocr-vl-support

Conversation

Copy link
Copy Markdown

Copilot AI commented Feb 26, 2026

PaddleOCR-VL models require paddlex[ocr] dependencies that weren't in the Docker build, causing runtime DependencyError when attempting to use VL models via API parameters.

Changes

Dependency Management

  • Added paddlex[ocr]>=3.4.0 to requirements.in and requirements.txt
  • Ensures VL models work in Docker deployments without additional installation steps

Docker Configuration

  • Created /root/.paddlex directory in Dockerfile for VL model cache
  • Added paddlex_models persistent volume in docker-compose.yml
  • Increased memory limit to 10GB (VL models require 4-6GB vs 2-3GB for traditional models)
  • Added optional PADDLE_PDX_DISABLE_MODEL_SOURCE_CHECK environment variable for faster startup

Log Cleanup

  • Suppressed expected PaddlePaddle/PaddleOCR library warnings in main.py using module-specific filters
  • Warnings about ignored parameters, missing ccache, PyTorch API compatibility are informational only

Documentation

  • Added Docker deployment section to PADDLEOCR_VL_GUIDE.md with volume management and troubleshooting
  • Updated README.md to position Docker as recommended deployment method for VL models
  • Clarified that warnings in logs are expected and harmless

Usage

# docker-compose.yml
volumes:
  - paddleocr_models:/root/.paddleocr    # Traditional models
  - paddlex_models:/root/.paddlex        # VL models
docker-compose up -d
# API call with VL model
GET /ocr/predict-by-path?image_path=doc.jpg&detection_model=PaddleOCR-VL-1.5

Backward Compatibility

Traditional PP-OCR models unaffected. VL models purely additive via optional query parameters.

Original prompt

Implemente as seguintes melhorias:

  1. Suporte opcional ao PaddleOCR-VL-1.5 e PaddleOCR-VL:

    • Permita que os endpoints atuais aceitem, opcionalmente, o uso dos modelos "PaddleOCR-VL-1.5" ou "PaddleOCR-VL" através de detection_model e/ou recognition_model.
    • Se o usuário passar um desses nomes de modelo (ex: detection_model='PaddleOCR-VL-1.5'), o endpoint deve executar a inferência usando a interface nativa do PaddleOCR-VL (ex: PaddleOCRVL), aproveitando as capacidades do modelo VL (incluindo layout, tabelas, gráficos, selos, multilinguagem, etc.).
    • Se os parâmetros detection_model/recognition_model não forem passados ou receberem valores tradicionais (PP-OCR*), o funcionamento permanece inalterado.
    • Não remova nem modifique a lógica dos modelos existentes; apenas adicione o caminho alternativo para PaddleOCR-VL.
  2. Documentação:

    • Atualize a documentação dos endpoints para listar explicitamente os valores válidos para detection_model e recognition_model, incluindo as novas opções PaddleOCR-VL-1.5 e PaddleOCR-VL, além das já suportadas (ex: PP-OCRv5_server_det, PP-OCRv5_server_rec, etc.).
    • Inclua descrição sucinta das características-chave dos modelos VL (suporte a tabelas, fórmulas, selos, 111 idiomas, etc).
  3. Testes:

    • Adicione testes automáticos básicos (ou exemplos de uso em docstring) que comprovem o uso dos modelos VL dentro dos endpoints existentes.

Exemplo de parâmetros válidos:
detection_model: [PP-OCRv5_server_det, PP-OCRv5_mobile_det, PP-OCRv4_server_det, PP-OCRv4_mobile_det, PaddleOCR-VL-1.5, PaddleOCR-VL]
recognition_model: [PP-OCRv5_server_rec, PP-OCRv5_mobile_rec, PP-OCRv4_server_rec, PP-OCRv4_mobile_rec, PaddleOCR-VL-1.5, PaddleOCR-VL]

Importante:

  • Não altere o fluxo padrão das rotas já existentes para quem não usa os modelos VL.
  • Toda lógica de fallback/caminho alternativo deve ser adicionada de forma segura e inofensiva para compatibilidade retroativa.
  • Siga o padrão e arquitetura já existente no projeto.

This pull request was created from Copilot chat.


💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

Copilot AI and others added 4 commits February 26, 2026 18:57
Co-authored-by: tpraxedes <53882938+tpraxedes@users.noreply.github.com>
Co-authored-by: tpraxedes <53882938+tpraxedes@users.noreply.github.com>
Co-authored-by: tpraxedes <53882938+tpraxedes@users.noreply.github.com>
Co-authored-by: tpraxedes <53882938+tpraxedes@users.noreply.github.com>
Copilot AI changed the title [WIP] Add optional support for PaddleOCR-VL models Add optional PaddleOCR-VL multimodal model support to all endpoints Feb 26, 2026
@tpraxedes
Copy link
Copy Markdown

@copilot verifique para corrigir os erros:

2026-02-26T19:04:57.037Z �[33mChecking connectivity to the model hosters, this may take a while. To bypass this check, set PADDLE_PDX_DISABLE_MODEL_SOURCE_CHECK to True.�[0m
2026-02-26T19:05:09.230Z /app/routers/ocr.py:111: UserWarning: lang and ocr_version will be ignored when model names or model directories are not None.
2026-02-26T19:05:09.231Z ocr_instance = PaddleOCR(
2026-02-26T19:05:09.802Z /usr/local/lib/python3.9/site-packages/paddle/utils/cpp_extension/extension_utils.py:718: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
2026-02-26T19:05:09.802Z warnings.warn(warning_message)
2026-02-26T19:05:10.268Z �[32mCreating model: ('PP-LCNet_x1_0_textline_ori', None)�[0m
2026-02-26T19:05:10.269Z �[32mUsing official model (PP-LCNet_x1_0_textline_ori), the model files will be automatically downloaded and saved in /root/.paddlex/official_models/PP-LCNet_x1_0_textline_ori.�[0m
2026-02-26T19:05:14.109Z Fetching 6 files: 0%| | 0/6 [00:00<?, ?it/s]
Fetching 6 files: 17%|█▋ | 1/6 [00:00<00:00, 6.37it/s]
Fetching 6 files: 83%|████████▎ | 5/6 [00:03<00:00, 1.34it/s]
Fetching 6 files: 100%|██████████| 6/6 [00:03<00:00, 1.69it/s]
2026-02-26T19:05:14.508Z �[32mCreating model: ('PP-OCRv5_server_det', None)�[0m
2026-02-26T19:05:14.509Z �[32mUsing official model (PP-OCRv5_server_det), the model files will be automatically downloaded and saved in /root/.paddlex/official_models/PP-OCRv5_server_det.�[0m
2026-02-26T19:05:18.067Z Fetching 6 files: 0%| | 0/6 [00:00<?, ?it/s]
Fetching 6 files: 17%|█▋ | 1/6 [00:00<00:00, 5.66it/s]
Fetching 6 files: 83%|████████▎ | 5/6 [00:03<00:00, 1.40it/s]
Fetching 6 files: 100%|██████████| 6/6 [00:03<00:00, 1.76it/s]
2026-02-26T19:05:20.719Z �[32mCreating model: ('PP-OCRv5_server_rec', None)�[0m
2026-02-26T19:05:20.719Z �[32mUsing official model (PP-OCRv5_server_rec), the model files will be automatically downloaded and saved in /root/.paddlex/official_models/PP-OCRv5_server_rec.�[0m
2026-02-26T19:05:24.179Z Fetching 6 files: 0%| | 0/6 [00:00<?, ?it/s]
Fetching 6 files: 17%|█▋ | 1/6 [00:00<00:01, 4.74it/s]
Fetching 6 files: 83%|████████▎ | 5/6 [00:03<00:00, 1.45it/s]
Fetching 6 files: 100%|██████████| 6/6 [00:03<00:00, 1.81it/s]
2026-02-26T19:05:28.095Z INFO: Started server process [1]
2026-02-26T19:05:28.095Z INFO: Waiting for application startup.
2026-02-26T19:05:28.095Z INFO: Application startup complete.
2026-02-26T19:05:28.095Z INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
2026-02-26T19:05:31.274Z INFO: 127.0.0.1:57354 - "GET /docs HTTP/1.1" 200 OK
2026-02-26T19:05:34.633Z INFO: 172.16.1.76:55649 - "GET /docs HTTP/1.1" 200 OK
2026-02-26T19:05:34.717Z INFO: 172.16.1.76:55649 - "GET /openapi.json HTTP/1.1" 200 OK
2026-02-26T19:05:48.939Z INFO: 172.16.1.76:62063 - "POST /ocr/predict-by-file?detection_model=PaddleOCR-VL-1.5&recognition_model=PaddleOCR-VL-1.5 HTTP/1.1" 500 Internal Server Error
2026-02-26T19:05:48.942Z ERROR: Exception in ASGI application
2026-02-26T19:05:48.942Z Traceback (most recent call last):
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/paddleocr/_pipelines/base.py", line 105, in _create_paddlex_pipeline
2026-02-26T19:05:48.942Z return create_pipeline(config=self._merged_paddlex_config, **kwargs)
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/paddlex/inference/pipelines/init.py", line 168, in create_pipeline
2026-02-26T19:05:48.942Z pipeline = BasePipeline.get(pipeline_name)(
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/paddlex/utils/deps.py", line 207, in _wrapper
2026-02-26T19:05:48.942Z require_extra(extra, obj_name=pipeline_name, alt=alt)
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/paddlex/utils/deps.py", line 200, in require_extra
2026-02-26T19:05:48.942Z raise DependencyError(msg)
2026-02-26T19:05:48.942Z paddlex.utils.deps.DependencyError: PaddleOCR-VL-1.5 requires additional dependencies. To install them, run pip install "paddlex[ocr]==<PADDLEX_VERSION>" if you’re installing paddlex from an index, or pip install -e "/path/to/PaddleX[ocr]" if you’re installing paddlex locally.
2026-02-26T19:05:48.942Z The above exception was the direct cause of the following exception:
2026-02-26T19:05:48.942Z Traceback (most recent call last):
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", line 403, in run_asgi
2026-02-26T19:05:48.942Z result = await app( # type: ignore[func-returns-value]
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in call
2026-02-26T19:05:48.942Z return await self.app(scope, receive, send)
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/fastapi/applications.py", line 1054, in call
2026-02-26T19:05:48.942Z await super().call(scope, receive, send)
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/starlette/applications.py", line 112, in call
2026-02-26T19:05:48.942Z await self.middleware_stack(scope, receive, send)
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/starlette/middleware/errors.py", line 187, in call
2026-02-26T19:05:48.942Z raise exc
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/starlette/middleware/errors.py", line 165, in call
2026-02-26T19:05:48.942Z await self.app(scope, receive, _send)
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/starlette/middleware/cors.py", line 93, in call
2026-02-26T19:05:48.942Z await self.simple_response(scope, receive, send, request_headers=headers)
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/starlette/middleware/cors.py", line 144, in simple_response
2026-02-26T19:05:48.942Z await self.app(scope, receive, send)
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 62, in call
2026-02-26T19:05:48.942Z await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2026-02-26T19:05:48.942Z raise exc
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
2026-02-26T19:05:48.942Z await app(scope, receive, sender)
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 714, in call
2026-02-26T19:05:48.942Z await self.middleware_stack(scope, receive, send)
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 734, in app
2026-02-26T19:05:48.942Z await route.handle(scope, receive, send)
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 288, in handle
2026-02-26T19:05:48.942Z await self.app(scope, receive, send)
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 76, in app
2026-02-26T19:05:48.942Z await wrap_app_handling_exceptions(app, request)(scope, receive, send)
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2026-02-26T19:05:48.942Z raise exc
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
2026-02-26T19:05:48.942Z await app(scope, receive, sender)
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 73, in app
2026-02-26T19:05:48.942Z response = await f(request)
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/fastapi/routing.py", line 301, in app
2026-02-26T19:05:48.942Z raw_response = await run_endpoint_function(
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/fastapi/routing.py", line 212, in run_endpoint_function
2026-02-26T19:05:48.942Z return await dependant.call(**values)
2026-02-26T19:05:48.942Z File "/app/routers/ocr.py", line 332, in predict_by_file
2026-02-26T19:05:48.942Z ocr_instance = get_ocr_instance(detection_model, recognition_model)
2026-02-26T19:05:48.942Z File "/app/routers/ocr.py", line 79, in get_ocr_instance
2026-02-26T19:05:48.942Z ocr_instance = PaddleOCRVL(
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/paddleocr/_pipelines/paddleocr_vl.py", line 86, in init
2026-02-26T19:05:48.942Z super().init(**kwargs)
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/paddleocr/_pipelines/base.py", line 67, in init
2026-02-26T19:05:48.942Z self.paddlex_pipeline = self._create_paddlex_pipeline()
2026-02-26T19:05:48.942Z File "/usr/local/lib/python3.9/site-packages/paddleocr/_pipelines/base.py", line 107, in _create_paddlex_pipeline
2026-02-26T19:05:48.942Z raise RuntimeError(
2026-02-26T19:05:48.942Z RuntimeError: A dependency error occurred during pipeline creation. Please refer to the installation documentation to ensure all required dependencies are installed.

Co-authored-by: tpraxedes <53882938+tpraxedes@users.noreply.github.com>
Copy link
Copy Markdown
Author

Copilot AI commented Feb 26, 2026

@copilot verifique para corrigir os erros:

2026-02-26T19:04:57.037Z �[33mChecking connectivity to the model hosters, this may take a while. To bypass this check, set `PADDLE_PDX_D...

Fixed in commit c5e3593.

The error occurred because PaddleOCR-VL models require additional dependencies that weren't installed.

Changes made:

  1. Added proper error handling that returns HTTP 501 with clear installation instructions
  2. Updated documentation (PADDLEOCR_VL_GUIDE.md, README.md) with prominent warnings about the dependency requirement
  3. Added troubleshooting section for this specific error

To use VL models, install:

pip install 'paddlex[ocr]'

The API now returns a helpful error message instead of crashing:

{
  "detail": "PaddleOCR-VL requires additional dependencies. Install with: pip install 'paddlex[ocr]'"
}

Traditional PP-OCR models continue to work without any additional dependencies.

Co-authored-by: tpraxedes <53882938+tpraxedes@users.noreply.github.com>
@tpraxedes
Copy link
Copy Markdown

@copilot ajuste o projeto para resolver as dependências necessárias, como o Dockerfile e docker-compose

Co-authored-by: tpraxedes <53882938+tpraxedes@users.noreply.github.com>
Copy link
Copy Markdown
Author

Copilot AI commented Feb 26, 2026

@copilot ajuste o projeto para resolver as dependências necessárias, como o Dockerfile e docker-compose

Fixed in commit 9b85cb6.

Changes made:

  1. requirements.in / requirements.txt: Added paddlex[ocr] dependency so VL models work by default
  2. Dockerfile: Creates /root/.paddlex directory and includes VL dependencies in build
  3. docker-compose.yml:
    • Added paddlex_models volume for VL model persistence
    • Increased memory limit to 8GB (VL models need more resources)
    • Added optional PADDLE_PDX_DISABLE_MODEL_SOURCE_CHECK environment variable
  4. Documentation: Updated PADDLEOCR_VL_GUIDE.md and README.md with Docker deployment instructions

Usage:

docker-compose up -d

VL models now work out of the box in Docker. Models are cached in persistent volumes (paddleocr_models and paddlex_models) so they don't need to be re-downloaded after container restarts.

Copilot AI and others added 2 commits February 26, 2026 19:17
Co-authored-by: tpraxedes <53882938+tpraxedes@users.noreply.github.com>
Co-authored-by: tpraxedes <53882938+tpraxedes@users.noreply.github.com>
Copilot AI changed the title Add optional PaddleOCR-VL multimodal model support to all endpoints Add optional PaddleOCR-VL multimodal model support with complete Docker integration Feb 26, 2026
Copilot AI and others added 3 commits February 26, 2026 19:31
Co-authored-by: tpraxedes <53882938+tpraxedes@users.noreply.github.com>
Co-authored-by: tpraxedes <53882938+tpraxedes@users.noreply.github.com>
Co-authored-by: tpraxedes <53882938+tpraxedes@users.noreply.github.com>
Copilot AI changed the title Add optional PaddleOCR-VL multimodal model support with complete Docker integration Add PaddleOCR-VL multimodal model support with Docker dependencies Feb 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants