feat: add OCI Generative AI provider — basic text completion#4959
fede-kamel wants to merge 8 commits into crewAIInc:main
Conversation
@greysonlalonde Hey! Following up on your feedback from #4885 — I split the original PR into smaller, scoped pieces. This is the first one: just the basic text completion provider for OCI GenAI (no streaming, no tools, no structured output, no multimodal, no embeddings). I couldn't trim this one further — it's the minimal foundation (provider class + shared auth + registration) plus tests. The source code is ~600 lines and the rest is test fixtures/tests. Each follow-up PR will layer on one capability at a time. Also, per the community tools guide you shared, the OCI tools (InvokeAgent, KnowledgeBase, ObjectStorage) will go into a standalone PyPI package. Would appreciate a review whenever you get a chance. Thanks!
Add streaming text completion via OCI SSE events: - stream=True in call() routes to _stream_call_impl with chunk events - iter_stream() yields raw text chunks (sync generator) - astream() wraps iter_stream via thread+queue for async callers - _stream_chat_events holds client lock for full stream duration - SSE event parsing handles both string and mapping payloads Tested live against meta.llama-3.3-70b-instruct, cohere.command-r-plus-08-2024, google.gemini-2.5-flash, and openai.gpt-5.2-chat-latest. Depends on: crewAIInc#4959 Tracking issue: crewAIInc#4944
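The thread+queue bridge this commit describes (a sync SSE chunk generator pumped into an async consumer) can be sketched roughly as follows; `aiter_from_sync` and the queue size are illustrative assumptions, not the PR's actual API:

```python
import asyncio
import queue
import threading
from typing import AsyncIterator, Iterator

_SENTINEL = object()  # marks end-of-stream in the queue

async def aiter_from_sync(chunks: Iterator[str]) -> AsyncIterator[str]:
    """Bridge a sync chunk generator to async callers via a thread + queue."""
    q: queue.Queue = queue.Queue(maxsize=16)

    def _pump() -> None:
        try:
            for chunk in chunks:
                q.put(chunk)
        finally:
            q.put(_SENTINEL)  # always signal completion, even on error

    threading.Thread(target=_pump, daemon=True).start()
    loop = asyncio.get_running_loop()
    while True:
        # q.get() blocks, so hand it to the default executor to keep the loop free
        item = await loop.run_in_executor(None, q.get)
        if item is _SENTINEL:
            break
        yield item
```

The daemon thread holds the sync generator (and, in the PR's design, the client lock) for the full stream duration while the event loop stays responsive.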
Add native function calling for generic and Cohere model families: - _format_tools converts CrewAI tool specs to OCI SDK format - _extract_tool_calls normalizes responses back to CrewAI shape - _handle_tool_calls executes tools and recurses until model finishes - Cohere tool message handling with trailing tool results - Tool choice control (auto/none/required/function) - Passthrough parameter filtering via SDK introspection - Streaming tool call accumulation from SSE fragments - supports_function_calling() returns True Tested live against meta.llama-3.3-70b-instruct with raw tool call return and recursive tool execution. Depends on: crewAIInc#4961 (streaming), crewAIInc#4959 (basic text) Tracking issue: crewAIInc#4944
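The recurse-until-finished tool loop can be illustrated with a minimal sketch; the message shapes and helper names below are assumptions for illustration, not the PR's code:

```python
import json
from typing import Any, Callable

def run_tool_loop(
    chat: Callable[[list[dict]], dict],
    tools: dict[str, Callable[..., Any]],
    messages: list[dict],
    max_rounds: int = 5,
) -> str:
    """Ask the model, execute any requested tools, and repeat until it answers."""
    for _ in range(max_rounds):
        response = chat(messages)
        calls = response.get("tool_calls") or []
        if not calls:
            return response["content"]  # no tool requests: model is finished
        messages.append({"role": "assistant", "tool_calls": calls})
        for call in calls:  # run each tool and feed the result back
            result = tools[call["name"]](**json.loads(call["arguments"]))
            messages.append(
                {"role": "tool", "name": call["name"], "content": json.dumps(result)}
            )
    raise RuntimeError("tool loop exceeded max_rounds without a final answer")
```

The round cap guards against a model that keeps requesting tools indefinitely.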
Add response_model (Pydantic) support for structured output: - _build_response_format converts Pydantic schema to OCI JsonSchemaResponseFormat (generic) or CohereResponseJsonFormat - _parse_structured_response validates and returns typed models - response_model threaded through call, _call_impl, _stream_call_impl, and _handle_tool_calls for full coverage - Handles JSON in markdown fences via base class _validate_structured_output Tested live against meta.llama-3.3-70b-instruct and google.gemini-2.5-flash. Depends on: crewAIInc#4962 (tool calling), crewAIInc#4961 (streaming), crewAIInc#4959 (basic text) Tracking issue: crewAIInc#4944
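Converting a Pydantic model into a JSON-schema response format and validating the reply back into a typed instance might look roughly like this; the dict layout is illustrative (the real PR builds OCI SDK objects such as JsonSchemaResponseFormat):

```python
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

def build_response_format(model_cls: type[BaseModel]) -> dict:
    """Shape a Pydantic model into a generic JSON-schema response-format dict."""
    return {
        "type": "JSON_SCHEMA",
        "json_schema": {
            "name": model_cls.__name__,
            "schema": model_cls.model_json_schema(),
        },
    }

def parse_structured_response(model_cls: type[BaseModel], raw: str) -> BaseModel:
    """Validate the model's raw JSON reply into a typed instance."""
    return model_cls.model_validate_json(raw)
```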
Add multimodal content handling for generic model families: - vision.py: model lists, data URI helpers, image encoding utilities - _build_generic_content handles image_url, document_url, video_url, audio_url content types mapped to OCI SDK content objects - _message_has_multimodal_content detects non-text payloads - Cohere models reject multimodal with clear error message - supports_multimodal() returns True Depends on: crewAIInc#4963, crewAIInc#4962, crewAIInc#4961, crewAIInc#4959 Tracking issue: crewAIInc#4944
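The data URI helpers mentioned for vision.py can be approximated with the standard library; the function names here are hypothetical:

```python
import base64
import mimetypes

def to_data_uri(filename: str, data: bytes) -> str:
    """Encode raw bytes as a data URI, guessing the MIME type from the filename."""
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    return f"data:{mime};base64," + base64.b64encode(data).decode("ascii")

def is_data_uri(value: str) -> bool:
    """Cheap check that a content URL is already an inline base64 data URI."""
    return value.startswith("data:") and ";base64," in value
```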
Add OCI embedding support integrated with CrewAI's RAG pipeline: - OCIEmbeddingFunction: ChromaDB-compatible embedding callable with batching, config serialization, image embedding support - OCIProvider: Pydantic-based provider with alias validation for env vars and config keys - Factory registration in embeddings/factory.py + types.py - Supports text and image embeddings, output dimensions, custom endpoints, all 4 OCI auth modes Tested live against cohere.embed-english-v3.0 with API_KEY auth. Depends on: crewAIInc#4964, crewAIInc#4963, crewAIInc#4962, crewAIInc#4961, crewAIInc#4959 Tracking issue: crewAIInc#4944
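The batching behavior of a ChromaDB-style embedding callable reduces to a small order-preserving loop; this sketch assumes the caller supplies the batch-embedding function:

```python
from typing import Callable, Sequence

def embed_in_batches(
    embed_batch: Callable[[list[str]], list[list[float]]],
    texts: Sequence[str],
    batch_size: int = 96,
) -> list[list[float]]:
    """Embed texts in fixed-size batches while preserving input order."""
    vectors: list[list[float]] = []
    for start in range(0, len(texts), batch_size):
        vectors.extend(embed_batch(list(texts[start : start + batch_size])))
    return vectors
```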
Replace asyncio.to_thread wrappers with true async I/O using aiohttp for acall() and astream(). The OCI SDK is sync-only, so we bypass it for HTTP and use its signer for request authentication directly. - oci_async.py: OCIAsyncClient with aiohttp, OCI request signing, native SSE parsing, connection pooling - acall(): true async chat completion (no thread pool) - astream(): true async SSE streaming (no thread+queue bridge) - Graceful fallback to asyncio.to_thread when aiohttp unavailable or client is mocked (unit tests) - aiohttp + certifi added to crewai[oci] optional deps Temporary measure until OCI SDK ships native async support. Tested live: acall, astream, and concurrent acall against meta.llama-3.3-70b-instruct with API_KEY auth. Depends on: crewAIInc#4966, crewAIInc#4964, crewAIInc#4963, crewAIInc#4962, crewAIInc#4961, crewAIInc#4959 Tracking issue: crewAIInc#4944
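Native SSE parsing at the line level follows a simple framing rule: data: lines accumulate into a buffer, and a blank line dispatches the event. A minimal sketch (not the PR's parser):

```python
from typing import Iterable, Iterator

def iter_sse_events(lines: Iterable[str]) -> Iterator[str]:
    """Yield SSE event payloads from a stream of decoded lines."""
    buffer: list[str] = []
    for line in lines:
        if line.startswith("data:"):
            buffer.append(line[len("data:"):].lstrip())
        elif line == "" and buffer:
            yield "\n".join(buffer)  # blank line ends the event
            buffer = []
    if buffer:  # flush a final event that lacked a trailing blank line
        yield "\n".join(buffer)
```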
Force-pushed 105d7dd to e6b52b5
@greysonlalonde — following up on your feedback from #4885. I've split the original PR into scoped pieces per your guidance. The first one is ready for review. PR 1 of 7 — Basic OCI text completion (this PR). Covers only the foundation: provider class, shared auth utilities, routing registration, and tests. No streaming, tool calling, structured output, multimodal, or embeddings — all deferred to follow-up PRs. The Bugbot-flagged issues are addressed in the latest commits. Full series tracking issue: #4944.
Would appreciate a review whenever you get a chance. Thanks again for the guidance on scope!
- OCICompletion(BaseLLM): sync call() and async acall() for generic (Meta, Google, OpenAI, xAI) and Cohere model families - Shared OCI auth utilities (utilities/oci.py): API key, security token, instance principal, and resource principal auth - Provider routing in llm.py: oci/ prefix and OCI model-id patterns - oci registered as optional dependency (crewai[oci]) - Configurable timeout via DEFAULT_OCI_TIMEOUT constant - Cohere and generic request/response paths fully separated Tracking issue: crewAIInc#4944 Part 1 of series: crewAIInc#4959 → crewAIInc#4961 → crewAIInc#4962 → crewAIInc#4963 → crewAIInc#4964 → crewAIInc#4966 → crewAIInc#4982
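The oci/ prefix and model-pattern routing can be illustrated as follows; the function name and family tuple are assumptions for illustration (the real logic lives in llm.py and also matches dedicated-endpoint OCIDs):

```python
# Model-id prefixes served by OCI GenAI in this sketch (illustrative, not exhaustive)
OCI_MODEL_FAMILIES = ("meta.", "cohere.", "google.", "openai.", "xai.")

def is_oci_model(model: str) -> bool:
    """Route on an explicit oci/ prefix, else on a known OCI model-id pattern."""
    model_lower = model.lower()  # case-insensitive, matching other providers
    if model_lower.startswith("oci/"):
        return True
    return model_lower.startswith(OCI_MODEL_FAMILIES)
```

In the real provider the pattern check must avoid hijacking models that other providers also serve, which is why the explicit oci/ prefix is the unambiguous path.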
Force-pushed b904813 to 7561683
Live integration test results

Validated against a live OCI GenAI account.
All three model families (Llama, Cohere, Gemini) confirmed working against the live API, both sync and async.

Hi @greysonlalonde — just a gentle follow-up. I did everything you asked after #4885 — split the work into 7 scoped PRs, deferred streaming, tool calling, structured output, multimodal, and embeddings to separate PRs, and confirmed OCI tools will go into a standalone PyPI package. This first PR is the minimal foundation: provider class, shared auth, routing registration, and tests. It's been validated against a live OCI GenAI account across all three model families (Llama, Cohere, Gemini) — both sync and async. Would really appreciate a review whenever you get the chance. Happy to make any adjustments!
Add native OCI Generative AI support to CrewAI with basic text completion for generic (Meta, Google, OpenAI, xAI) and Cohere model families. This is the first in a series of PRs to incrementally build out full OCI support (streaming, tool calling, structured output, embeddings, and multimodal in follow-up PRs). Tracking issue: crewAIInc#4944 Supersedes: crewAIInc#4885
Tool calling is not implemented in this PR. Returning True would cause CrewAI to choose the native tools path, silently dropping tools from agents. Flagged by Cursor Bugbot review.
Both methods are unnecessary in this PR. The base class and callers already default correctly when the methods are absent: - supports_function_calling: callers use getattr with False default - supports_stop_words: base class already returns True These will be added back in the tool calling follow-up PR.
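The caller-side default mentioned above (getattr with False) means a provider that omits supports_function_calling is safely treated as tool-less; a minimal illustration with hypothetical class names:

```python
def caller_supports_tools(llm: object) -> bool:
    """Probe a provider for tool support; a missing method defaults to False."""
    probe = getattr(llm, "supports_function_calling", None)
    return bool(probe and probe())

class BareProvider:  # omits the method entirely, as this PR now does
    pass

class ToolProvider:  # a follow-up PR would opt in like this
    def supports_function_calling(self) -> bool:
        return True
```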
Remove json, re imports and _OCI_SCHEMA_NAME_PATTERN regex that are only needed for structured output (not in this PR scope).
Use model_lower instead of model in the dot check to match the convention used by all other providers in _matches_provider_pattern. Flagged by Cursor Bugbot.
The explicit OCI branch returned the same _matches_provider_pattern call as the generic fallback. Removing it since it adds no distinct logic. Flagged by Cursor Bugbot.
- Add missing empty-string fallback for bare strings in list content - Use case-insensitive model checks consistently across all methods - Replace over-engineered FIFO ticket queue with simple threading.Lock
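Replacing the FIFO ticket queue with a plain threading.Lock still serializes access to the non-thread-safe client; a sketch with hypothetical names:

```python
import threading

class LockedClient:
    """Serialize all calls to a non-thread-safe client with one plain lock."""

    def __init__(self, client) -> None:
        self._client = client
        self._lock = threading.Lock()

    def chat(self, payload):
        with self._lock:  # only one in-flight request per client instance
            return self._client.chat(payload)
```

A bare Lock gives no FIFO fairness guarantee, but for serializing SDK calls that is usually acceptable and much simpler.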
…ature - Move _normalize_messages inside llm_call_context and try/except so validation errors emit call_failed events consistently - Narrow _call_impl to accept list[LLMMessage] only, removing unreachable str normalization path
Force-pushed b5130c4 to 7ef6c02
Rebased on latest main. Unit tests — 15/15 passed (Python 3.13, mocked OCI SDK)
Branch is clean — 8 commits rebased on current main, no merge commits. @greysonlalonde — would really appreciate a review on this foundational PR when you get a chance. The remaining 6 PRs in the series (#4961–#4982) all build on top of this one. Happy to address any feedback!
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit 7ef6c02.
    models.CohereChatBotMessage(
        message=self._coerce_text(content) or " ",
    )
)
Cohere empty content handling is inconsistent across roles
Low Severity
In _build_cohere_chat_history, empty content for assistant messages is replaced with " " (a space character) via self._coerce_text(content) or " ", but empty content for user and system messages passes through self._coerce_text(content) with no fallback, yielding an empty string "". If the Cohere API rejects empty message fields on CohereUserMessage or CohereSystemMessage, this inconsistency would cause API errors for user/system messages while silently working for assistant messages.
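One way to resolve this inconsistency is a single role-agnostic helper that applies the same fallback everywhere; coerce_text and its fallback policy here are hypothetical, not the PR's code:

```python
def coerce_text(content, fallback: str = " ") -> str:
    """Flatten message content to text, substituting a fallback for empties."""
    if isinstance(content, list):  # multi-part content: join the text parts
        text = "".join(
            part.get("text", "") for part in content if isinstance(part, dict)
        )
    else:
        text = content or ""
    return text or fallback  # identical fallback for every role
```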
def _get_oci_module() -> Any:
    """Backward-compatible module-local alias used by tests and patches."""
    return get_oci_module()
Wrapper function adds unnecessary indirection over direct import
Low Severity
The module-level _get_oci_module function is a trivial wrapper that delegates to get_oci_module (already imported on line 14). Its docstring says it exists for test patching, but tests could instead directly monkeypatch the imported get_oci_module reference at crewai.llms.providers.oci.completion.get_oci_module. Since create_oci_client_kwargs already receives the module explicitly via oci_module=self._oci, patching the single local reference would be sufficient.


Summary
- New provider class OCICompletion supporting generic (Meta, Google, OpenAI, xAI) and Cohere model families
- Shared OCI auth utilities (utilities/oci.py) for API key, security token, instance principal, and resource principal auth
- oci registered as an optional dependency (crewai[oci])
- Tested live against meta.llama-3.3-70b-instruct, cohere.command-r-plus-08-2024, google.gemini-2.5-flash, and openai.gpt-5.2-chat-latest

This is PR 1 of a series — follow-up PRs will add streaming, tool calling, structured output, multimodal, and embeddings support.
Supersedes #4885 (closed per reviewer feedback to split by scope).
Tracking issue: #4944
What's included
- utilities/oci.py: get_oci_module() + create_oci_client_kwargs()
- llms/providers/oci/completion.py: OCICompletion(BaseLLM) — init, message building, basic call/acall
- llm.py: provider routing
- pyproject.toml: oci optional dependency

What's NOT included (deferred to follow-up PRs)
- Streaming (iter_stream, astream)
- Tool calling
- Structured output (response_model)
- Multimodal
- Embeddings

Test plan
- meta.llama-3.3-70b-instruct
- cohere.command-r-plus-08-2024
- google.gemini-2.5-flash
- openai.gpt-5.2-chat-latest
- Sync (call) and async (acall) paths verified

Note
Medium Risk
Adds a new native LLM provider with multiple OCI authentication modes and new routing/pattern matching, which could affect model selection and introduce integration/auth edge cases. Changes are mostly additive and isolated, but involve external SDK calls and credential handling.
Overview
Adds first-class support for Oracle Cloud Infrastructure (OCI) Generative AI as a native LLM provider, including provider routing (oci/...) and model-pattern validation for OCI model IDs and dedicated endpoint OCIDs.

Introduces OCICompletion for basic synchronous/async chat-based text completion (generic + Cohere formats), plus shared OCI SDK utilities for lazy importing and building client auth kwargs (API key, security token, instance/resource principals). Also registers oci as an optional dependency (crewai[oci]) and adds mocked unit tests plus live integration tests for the new provider.

Reviewed by Cursor Bugbot for commit 7ef6c02. Bugbot is set up for automated code reviews on this repo.