Member

@iulusoy iulusoy commented Oct 17, 2025

DimasfromLavoisier and others added 22 commits September 12, 2025 16:10
add new dependencies for upcoming models
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.12.10 → v0.13.0](astral-sh/ruff-pre-commit@v0.12.10...v0.13.0)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.13.0 → v0.13.1](astral-sh/ruff-pre-commit@v0.13.0...v0.13.1)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.13.1 → v0.13.3](astral-sh/ruff-pre-commit@v0.13.1...v0.13.3)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@iulusoy iulusoy requested a review from Copilot October 22, 2025 10:26
Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds multimodal image summary and visual question answering (VQA) capabilities to the ammico package using the Qwen2.5-VL model. The changes address dependency compatibility issues and remove the redundant analyse_text parameter from TextDetector.

  • Adds new MultimodalSummaryModel and ImageSummaryDetector classes for image captioning and VQA
  • Removes the analyse_text boolean parameter from TextDetector (always enabled now)
  • Updates dependencies to resolve compatibility issues with CUDA 11.8 and JupyterLab

Reviewed Changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| pyproject.toml | Added ML dependencies (transformers, torch, etc.) and adjusted tensorflow/googletrans versions for compatibility |
| environment.yml | New conda environment specification with CUDA 11.8 and PyTorch 2.3.1 |
| ammico/utils.py | Added AnalysisType enum for categorizing analysis types |
| ammico/text.py | Removed analyse_text parameter and its validation logic |
| ammico/test/test_text.py | Updated tests to remove analyse_text parameter usage |
| ammico/test/test_model.py | New tests for MultimodalSummaryModel initialization and resource management |
| ammico/test/test_image_summary.py | New tests for image summary and VQA functionality |
| ammico/test/test_display.py | Updated display test to match new parameter structure |
| ammico/test/conftest.py | Added model fixture for test suite |
| ammico/notebooks/DemoNotebook_ammico.ipynb | Updated notebook to use new API without analyse_text parameter |
| ammico/notebooks/DemoImageSummaryVQA.ipynb | New demo notebook showcasing image summary and VQA features |
| ammico/model.py | New class implementing Qwen2.5-VL model loading and management |
| ammico/image_summary.py | New class for image captioning and visual question answering |
| ammico/display.py | Added VQA detector option and UI components for questions |
| ammico/__init__.py | Exported new MultimodalSummaryModel and ImageSummaryDetector classes |
| .pre-commit-config.yaml | Updated ruff version |
| .github/workflows/ci.yml | Excluded long-running tests from CI |
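
For readers unfamiliar with the pattern, the AnalysisType entry for ammico/utils.py could look roughly like the sketch below. This is only an illustration: the member names and the `parse_analysis_type` helper are hypothetical and are not taken from the PR, so the actual enum in ammico/utils.py may differ.

```python
from enum import Enum


class AnalysisType(str, Enum):
    """Hypothetical members; the real enum in ammico/utils.py may differ."""

    SUMMARY = "summary"
    QUESTIONS = "questions"
    SUMMARY_AND_QUESTIONS = "summary_and_questions"


def parse_analysis_type(value):
    """Accept an enum member or its string value; reject anything else.

    Hypothetical helper, shown only to illustrate input validation.
    """
    try:
        return AnalysisType(value)
    except ValueError:
        allowed = ", ".join(t.value for t in AnalysisType)
        raise ValueError(f"analysis type must be one of: {allowed}") from None
```

Subclassing both `str` and `Enum` lets callers pass either the plain string or the enum member, while a misspelled value fails early with a message that lists the allowed options.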
Comments suppressed due to low confidence (2)

ammico/notebooks/DemoNotebook_ammico.ipynb:1

  • Hardcoded absolute path to credentials file exposes user-specific filesystem structure and should be parameterized or use relative paths. Consider using environment variables or a configuration file.

ammico/notebooks/DemoNotebook_ammico.ipynb:1

  • The variable data_path is defined but appears to be unused in the subsequent code, which still references the hardcoded path in the ammico.find_files() call. Either remove the unused variable or update the function call to use it.
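
A minimal sketch of what the two suppressed comments suggest, assuming the notebook reads its paths from environment variables. The variable names `AMMICO_CREDENTIALS` and `AMMICO_DATA_PATH`, the defaults, and the `path_from_env` helper are hypothetical, and a plain glob stands in for the notebook's file-discovery call.

```python
import os
from pathlib import Path


def path_from_env(var_name, default):
    """Return the path stored in environment variable `var_name`, or `default`."""
    return Path(os.environ.get(var_name, default)).expanduser()


# Hypothetical variable names and defaults, for illustration only.
credentials_path = path_from_env("AMMICO_CREDENTIALS", "credentials.json")
data_path = path_from_env("AMMICO_DATA_PATH", "data")

# Pass `data_path` on to the code that needs it, so the variable is actually used.
# A plain glob stands in here for the notebook's file-discovery call.
image_files = sorted(data_path.glob("*.png")) if data_path.is_dir() else []
```

With this shape, no user-specific absolute path is committed to the notebook, and `data_path` is defined once and reused instead of being shadowed by a second hardcoded string.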

Member Author

iulusoy commented Oct 23, 2025

This PR looks good to me. If that is ok, I will make some small changes to the structure of the class and helper methods in the summary module.

@iulusoy iulusoy requested a review from Copilot October 27, 2025 08:50
Contributor

Copilot AI left a comment

Pull Request Overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 8 comments.


1. Feature extraction from the images: User inputs query and images are matched to that query (both text and image query)
1. Question answering
1. Question answering about image content
1. Content extractioni from the videos

Copilot AI Oct 27, 2025

Corrected spelling of 'extractioni' to 'extraction'.

Suggested change
1. Content extractioni from the videos
1. Content extraction from the videos

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
iulusoy and others added 2 commits October 27, 2025 10:01
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Member Author

iulusoy commented Oct 27, 2025

@DimasfromLavoisier could you please check that my changes did not break anything? I'm still having issues running some of the tests/analysis on my laptop (I did not have time to set up a compatible CUDA version for torch and tensorflow).

@DimasfromLavoisier DimasfromLavoisier merged commit 19f33c3 into main Oct 29, 2025
9 checks passed

Development

Successfully merging this pull request may close these issues.

ipykernel 7.0.0 cannot send display data from threads

3 participants