
feat(flo-ai): native file support and add enum #278

Merged: vizsatiz merged 4 commits into develop from fix/workflow-fn-kwargs on Apr 22, 2026

@vishnurk6247 (Member) commented Apr 22, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Added enum field type support for schema definitions
    • Introduced opt-in performance profiling capability for monitoring execution
  • Improvements

    • Enhanced document handling with unified formatting across LLM providers
    • Optimized document processing with concurrent formatting and PDF rasterization
    • Refactored workflow and agent execution for improved efficiency
  • Dependencies

    • Updated document processing libraries for better PDF support

@coderabbitai (Bot) commented Apr 22, 2026

Caution: Review failed. The pull request was closed or merged during review.

Walkthrough

This PR introduces profiling instrumentation and refactors agent and workflow execution so that documents are formatted concurrently. It also adds an enum field type to the YAML parser, implements provider-specific document formatting methods across multiple LLM providers, moves PDF rasterization out of the document processor and into the LLM classes, and replaces the pypdf and pymupdf4llm dependencies with pymupdf.
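
To illustrate the move from sequential awaits to concurrent formatting, here is a minimal sketch using `asyncio.gather`. The names `format_document` and `format_all` are hypothetical stand-ins, not the actual `BaseAgent` or LLM methods touched by this PR, and the sleep only simulates formatting latency.

```python
import asyncio
from typing import Any, Dict, List


async def format_document(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for a provider's async document formatter."""
    await asyncio.sleep(0.01)  # simulate I/O or rasterization latency
    return {"type": "document", "name": doc["name"], "formatted": True}


async def format_all(docs: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    # Create every coroutine first, then await them together.
    # gather returns results in the same order as its arguments.
    tasks = [format_document(doc) for doc in docs]
    return await asyncio.gather(*tasks)


def run_demo() -> List[str]:
    docs = [{"name": "a.pdf"}, {"name": "b.pdf"}, {"name": "c.txt"}]
    results = asyncio.run(format_all(docs))
    return [r["name"] for r in results]
```

Because `gather` preserves argument order, the formatted blocks can be placed back into the message history in their original positions even though the formatting work overlaps.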

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Profiling Infrastructure**<br>`flo_ai/flo_ai/utils/profiler.py`, `flo_ai/flo_ai/telemetry/instrumentation.py` | New profiling utility module with opt-in global state, context-manager profilers (`aprofile`, `profile`), and decorator factories. Telemetry instrumentation updated to wrap LLM calls, streams, and agent executions with profiling blocks. |
| **Agent & Workflow Execution**<br>`flo_ai/flo_ai/agent/base_agent.py`, `flo_ai/flo_ai/arium/arium.py` | Split message-history and graph-execution methods into profiled wrappers delegating to `_impl` variants. `BaseAgent` now collects document-formatting tasks concurrently via `asyncio.gather` instead of sequential awaits. `Arium` refactored node dispatch into shared `_dispatch_node_run` method with profiling coverage. |
| **LLM Document Formatting Methods**<br>`flo_ai/flo_ai/llm/base_llm.py`, `flo_ai/flo_ai/llm/anthropic_llm.py`, `flo_ai/flo_ai/llm/gemini_llm.py`, `flo_ai/flo_ai/llm/rootflo_llm.py` | Added provider-specific `format_document_in_message` implementations. `BaseLLM` now rasterizes PDFs to image blocks by default using pymupdf. Anthropic produces native document blocks. Gemini converts to `types.Part`. `RootFloLLM` delegates to underlying provider. Per-instance caching via `_formatted_cache` added to each. |
| **LLM Constructor Updates**<br>`flo_ai/flo_ai/llm/anthropic_llm.py`, `flo_ai/flo_ai/llm/openai_llm.py`, `flo_ai/flo_ai/llm/azure_openai_llm.py` | Updated import statements to include `DocumentMessageContent`. Constructor calls to `super().__init__` reformatted to explicit keyword arguments. Azure OpenAI docstring clarified that PDFs are rasterized as images. |
| **YAML Parser Enum Support**<br>`flo_ai/flo_ai/formatter/yaml_format_parser.py` | Added enum field type mapping to `__create_enum_type`. Introduced `__extract_enum_value` helper to normalize literal and enum entries (dict with `value` key or primitives). Updated field generation to append "Must be exactly one of" constraint for enum fields. |
| **Data Models**<br>`flo_ai/flo_ai/models/agent.py`, `flo_ai/flo_ai/models/chat_message.py` | `ParserFieldModel` expanded to support `type='enum'` with `values` accepting `Union[LiteralValueModel, str, int, float]`. Validation enforces `values` presence for enum/literal types. `ImageMessageContent` and `DocumentMessageContent` now initialize `_formatted_cache` dict in `__post_init__`. |
| **Document Processing Refactoring**<br>`flo_ai/flo_ai/utils/document_processor.py` | Removed pymupdf4llm integration; replaced with lightweight `_extract_with_pymupdf` for text extraction. `TXTProcessor` simplified to support only bytes/base64. Added type annotations for processor map and methods. PDF rasterization responsibility moved to LLM classes. |
| **Configuration & External Module**<br>`flo_ai/pyproject.toml`, `wavefront/server/modules/tools_module/tools_module/utils/message_processor_fn.py` | Version bumped to 1.1.4. Dependencies updated: removed pypdf and pymupdf4llm, added pymupdf (>=1.24.0,<2). Message processor payload now uses `input_data['kwargs']` instead of full `input_data` dict. |
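
The enum handling summarized above (normalizing entries that are either a dict with a `value` key or a bare primitive, then appending a "Must be exactly one of" constraint) could look roughly like the sketch below. The function names, the exact constraint wording after the colon, and the use of `typing.Literal` as the resulting type are assumptions for illustration; the actual `yaml_format_parser.py` code is not shown in this conversation.

```python
from typing import Any, List, Literal, Tuple, Union, get_args

Primitive = Union[str, int, float]


def extract_enum_value(entry: Any) -> Primitive:
    """Return the raw value from a {'value': ...} dict, or a primitive unchanged."""
    if isinstance(entry, dict):
        if "value" not in entry:
            raise ValueError(f"enum entry is missing 'value': {entry!r}")
        return entry["value"]
    if isinstance(entry, (str, int, float)):
        return entry
    raise TypeError(f"unsupported enum entry: {entry!r}")


def build_enum_field(values: List[Any], description: str) -> Tuple[Any, str]:
    """Build a Literal type and a description carrying the allowed values."""
    if not values:
        # Mirrors the validation rule that enum fields must declare values.
        raise ValueError("enum fields require a non-empty 'values' list")
    normalized = tuple(extract_enum_value(v) for v in values)
    # Subscripting Literal with a tuple is the same as listing each value.
    field_type = Literal[normalized]
    constraint = "Must be exactly one of: " + ", ".join(repr(v) for v in normalized)
    return field_type, f"{description} {constraint}".strip()
```

For example, `build_enum_field([{"value": "low"}, "high"], "Priority level.")` yields a type whose `get_args` are `("low", "high")`, plus a description that spells out the allowed values for the model.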

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • fix: add ty and fix types #173: Shares overlapping refactoring of agent/LLM codepaths, particularly the introduction of _get_message_history_impl in BaseAgent and concurrent document formatting via asyncio.gather.

Suggested reviewers

  • vizsatiz

Poem

🐰 Hop, hop! The agent learns to run docs in parallel,
While profilers trace each leap through forest digital.
Enums bloom, PDFs rasterize to images bright—
Concurrent dreams make async work a delight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.79%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'native file support and add enum' clearly reflects the major changes in the PR: document/file handling refactoring across LLM implementations and new enum field type support in the YAML parser and models. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


_enabled: bool = False
_log_path: Optional[Path] = None
path = Path(log_file).resolve()
path.parent.mkdir(parents=True, exist_ok=True)
_file_handle = open(path, 'w', buffering=1, encoding='utf-8')
_log_path = path
except Exception:
pass
_file_handle = None
_log_path = None
try:
_file_handle.flush()
_file_handle.close()
except Exception:
from flo_ai.utils.logger import logger # late import to avoid cycles

logger.info('profile | %s', line)
except Exception:
)
try:
_file_handle.flush()
except Exception:
enable_profiling(
_env_path, mirror_console=_truthy(os.environ.get('FLO_AI_PROFILE_CONSOLE'))
)
except Exception:
@vizsatiz merged commit a276106 into develop Apr 22, 2026
7 of 9 checks passed
@vizsatiz deleted the fix/workflow-fn-kwargs branch April 22, 2026 08:56
thomastomy5 pushed a commit that referenced this pull request Apr 27, 2026
* fix(floware): get input from kwargs

* fix(flo-ai): use native pdf capability of llms

* feat(flo-ai): add profiler

* feat(flo-ai): add enum support in parser