[agentserver] azure-ai-agentserver -core and -invocation packages#45925
- added type spec model generation
- add model validator generation
- creating a server
* trying * generate contract models * add validator generator * fix model generation * add more unit tests * fix conflict * refined model generation
* trying * generate contract models * add validator generator * fix model generation * add more unit tests * fix conflict * refined model generation * renamed the package
* create response * cancel and delete * fix options
Routes are now collected by each mixin and passed up through super().__init__(routes=...) rather than appending to self.routes after construction. This avoids relying on Starlette's internal mutable routes list and ensures all routes are registered at Starlette initialization time.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
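The route-collection pattern described in this commit can be sketched without Starlette. This is a minimal illustration only: `BaseApp` is a hypothetical stand-in for Starlette, the mixin names are invented, and the `/invocations` path is an assumption (only `/readiness` appears elsewhere in this PR).

```python
from typing import Any, Callable, List, Optional, Tuple

# (path, handler) pair; a hypothetical stand-in for a Starlette Route object.
Route = Tuple[str, Callable[[], Any]]


class BaseApp:
    """Hypothetical stand-in for Starlette: routes are fixed at construction."""

    def __init__(self, routes: Optional[List[Route]] = None) -> None:
        self.routes: Tuple[Route, ...] = tuple(routes or [])


class HealthMixin(BaseApp):
    def __init__(self, routes: Optional[List[Route]] = None, **kwargs: Any) -> None:
        own: List[Route] = [("/readiness", lambda: "ok")]
        # Pass our routes plus whatever subclasses collected up the MRO.
        super().__init__(routes=own + list(routes or []), **kwargs)


class InvocationMixin(BaseApp):
    def __init__(self, routes: Optional[List[Route]] = None, **kwargs: Any) -> None:
        own: List[Route] = [("/invocations", lambda: "invoked")]
        super().__init__(routes=own + list(routes or []), **kwargs)


class Host(InvocationMixin, HealthMixin):
    """Both mixins contribute routes; nothing is appended after construction."""


host = Host()
paths = [path for path, _ in host.routes]
```

Because every `__init__` forwards through `super()`, the base class receives the complete route list in a single call and never has to expose a mutable list.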
…methods
Reduced _tracing.py from 1009 to 390 lines:
- Removed leaf_customer_span_id baggage mechanism (_parse_baggage_key, _override_parent_span_id, _extract_context baggage override)
- Removed set_baggage/detach_baggage/set_current_span/detach_context
- Replaced start_request_span + manual context with request_span(end_on_exit=False) using start_as_current_span, so child spans are correctly parented
- Folded span_name, build_span_attrs, _prepare_request_span_args, span(), start_span() into a single request_span() context manager
- TracingHelper public API is now: request_span, end_span, record_error, trace_stream
- Updated invocations to use simplified tracing (removed baggage_token, span_token)
- Removed baggage constants from InvocationConstants
- Removed _parse_baggage_key tests
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Starlette's TestClient runs synchronously in the same thread context, so OTel ContextVar propagation works correctly. HTTPX's ASGITransport runs the ASGI app in a different async context where ContextVars don't propagate, causing span parenting to break in tests only.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
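The ContextVar behavior behind this explanation can be shown with the standard library alone. The sketch below does not reproduce HTTPX or Starlette; `current_span` and `handler` are hypothetical stand-ins for the OTel current-span variable and the request handler.

```python
import asyncio
import contextvars

# Hypothetical stand-in for the ContextVar that OTel uses to track the active span.
current_span = contextvars.ContextVar("current_span", default=None)


async def handler() -> None:
    # Stand-in for request handling code that activates a span.
    current_span.set("request-span")


async def main():
    # Awaiting the coroutine directly runs it in the caller's context,
    # so the value set by the handler is still visible afterwards.
    await handler()
    same_context = current_span.get()

    current_span.set(None)

    # create_task() runs the coroutine in a copy of the current context,
    # so values set inside the task do not leak back to the caller.
    await asyncio.create_task(handler())
    separate_context = current_span.get()
    return same_context, separate_context


same_context, separate_context = asyncio.run(main())
```

Code that expects to observe the handler's span from outside only works in the first case, which is consistent with the test-only breakage this commit describes.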
sdk/agentserver/azure-ai-agentserver-core/azure/ai/agentserver/core/__init__.py
sdk/agentserver/azure-ai-agentserver-core/azure/ai/agentserver/core/_base.py
...ntserver/azure-ai-agentserver-invocations/samples/simple_invoke_agent/simple_invoke_agent.py
...er/azure-ai-agentserver-invocations/samples/streaming_invoke_agent/streaming_invoke_agent.py
...gentserver/azure-ai-agentserver-invocations/samples/async_invoke_agent/async_invoke_agent.py
…/core/_base.py Co-authored-by: Johan Stenberg (MSFT) <johan.stenberg@microsoft.com>
…_invoke_agent/async_invoke_agent.py Co-authored-by: Johan Stenberg (MSFT) <johan.stenberg@microsoft.com>
sdk/agentserver/azure-ai-agentserver-core/azure/ai/agentserver/core/__init__.py
| Automatically configures Azure Monitor and OTLP exporters when the
| corresponding environment variables are set. All span creation and
| lifecycle is managed by the host framework -- developers never interact
| with this class directly.
Nit: Let's add a doc string for connection_string
| log_provider = LoggerProvider(resource=resource)
| set_logger_provider(log_provider)
| log_provider.add_log_record_processor(BatchLogRecordProcessor(
|     OTLPLogExporter(endpoint=endpoint)))  # type: ignore[union-attr]
Do we need to call addHandler here, as well (similar to line 344)?
| trace_provider = _ensure_trace_provider(resource)
|
| if trace_provider is not None:
|     trace_provider.add_span_processor(_FoundryEnrichmentSpanProcessor(
Since this is the global TracerProvider, this span processor will be added multiple times if TracingHelper is initialized multiple times. Can we restrict this to one time only?
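One way to restrict the registration to a single time is a module-level flag behind a lock. This is a sketch, not the code in the PR: the `_enrichment_configured` name comes from the follow-up commit message, while `FakeTracerProvider` and `ensure_enrichment_processor` are hypothetical stand-ins.

```python
import threading

_enrichment_lock = threading.Lock()
_enrichment_configured = False  # flag name taken from the follow-up commit message


class FakeTracerProvider:
    """Hypothetical stand-in for the global SDK TracerProvider."""

    def __init__(self) -> None:
        self.processors = []

    def add_span_processor(self, processor) -> None:
        self.processors.append(processor)


def ensure_enrichment_processor(provider, processor_factory) -> bool:
    """Register the enrichment processor at most once per process.

    Returns True if this call performed the registration.
    """
    global _enrichment_configured
    with _enrichment_lock:
        if _enrichment_configured:
            return False
        provider.add_span_processor(processor_factory())
        _enrichment_configured = True
        return True


provider = FakeTracerProvider()
# Simulate the tracing setup being initialized three times.
results = [ensure_enrichment_processor(provider, object) for _ in range(3)]
```

The lock matters only if setup can run from several threads. A process-wide flag also means tests that swap the global provider would need to reset it.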
| from opentelemetry import trace
| from opentelemetry.sdk.trace import TracerProvider as SdkTracerProvider
| from opentelemetry.sdk.trace.export import SimpleSpanProcessor
| from opentelemetry.sdk.trace.export.in_memory import InMemorySpanExporter
Looks like these tests are being skipped due to a broken import:
- from opentelemetry.sdk.trace.export.in_memory import InMemorySpanExporter
+ from opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter
| return Response(content=store[inv_id])
| return JSONResponse({"error": {"code": "not_found", "message": "Not found"}}, status_code=404)
|
| @app.cancel_invocation_handler
- @app.cancel_invocation_handler
+ @server.cancel_invocation_handler
| return JSONResponse({"status": "cancelled"})
| return JSONResponse({"error": {"code": "not_found", "message": "Not found"}}, status_code=404)
|
| return app
- return app
+ return server
| ### Features Added
|
| - Initial release of `azure-ai-agentserver-invocations`.
| - `InvocationHandler` for wiring invocation protocol endpoints to an `AgentHost`.
| - Replaced `ErrorResponse.create()` static method with module-level `create_error_response()` function.
| - Replaced `AgentLogger.get()` static method with module-level `get_logger()` function.
| - Removed `AGENT_LOG_LEVEL` and `AGENT_GRACEFUL_SHUTDOWN_TIMEOUT` environment variable support from `Constants`.
| - Renamed health endpoint from `/healthy` to `/readiness`.
Let's keep the entry for the 1.0.0b1 release before this for historical reference.
| ### Simple synchronous agent
|
| ```python
| from azure.ai.agentserver.invocations import InvocationAgentServerHost
These code snippets are importing InvocationAgentServerHost twice.
| @@ -0,0 +1,220 @@
| # Azure AI AgentServerHost Invocations for Python
This is failing the Verify Readmes check because the title doesn't conform to the required pattern:
* ^Azure (.+ client library for Python|Smoke Test for Python|AI Agent Server Adapter for .*Python)
How about we update the last alternative in the regex to AI Agent Server.* for .*Python) to cover all the packages?
Then we can update this title to # Azure AI Agent Server Invocations Host for Python or something similar.
The core library README would also need to be updated to fit the pattern.
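The effect of the proposed regex change can be checked quickly with Python's `re` module. This is a sketch under one assumption: that the Verify Readmes check matches the heading text without its leading `# ` (the `title_ok` helper is hypothetical, not the actual check).

```python
import re

# Pattern quoted in the review comment.
CURRENT_PATTERN = re.compile(
    r"^Azure (.+ client library for Python|Smoke Test for Python|"
    r"AI Agent Server Adapter for .*Python)"
)
# Same pattern with the last alternative relaxed as proposed.
PROPOSED_PATTERN = re.compile(
    r"^Azure (.+ client library for Python|Smoke Test for Python|"
    r"AI Agent Server.* for .*Python)"
)


def title_ok(pattern: "re.Pattern[str]", heading: str) -> bool:
    # Assumption: the check sees the heading text without the leading "# ".
    return pattern.match(heading.lstrip("#").strip()) is not None


old_title = "# Azure AI AgentServerHost Invocations for Python"
new_title = "# Azure AI Agent Server Invocations Host for Python"
adapter_title = "# Azure AI Agent Server Adapter for Python"

checks = {
    "old title, current pattern": title_ok(CURRENT_PATTERN, old_title),
    "old title, proposed pattern": title_ok(PROPOSED_PATTERN, old_title),
    "new title, current pattern": title_ok(CURRENT_PATTERN, new_title),
    "new title, proposed pattern": title_ok(PROPOSED_PATTERN, new_title),
    "adapter title, proposed pattern": title_ok(PROPOSED_PATTERN, adapter_title),
}
```

Under that assumption, the existing title fails both patterns, so the regex change and the title rename are both needed.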
johanste
left a comment
Approve as work in progress. Remaining work to align with spec etc. is being tracked.
johanste
left a comment
Actually, docs build is failing. Need to investigate
Since it is still `import`ed from azure-ai-agentserver-invocations, we need to keep it for now.
/check-enforcer override
@ankitbko Left some comments as unresolved for potential follow-up.
- Added docstring for TracingHelper.__init__ connection_string param
- Added enrichment processor dupe guard (_enrichment_configured flag)
- Fixed InMemorySpanExporter import path in test_tracing.py
- Fixed @app → @server variable name mismatch in test_tracing.py
- Updated invocations CHANGELOG with 2.0.0b1 + kept 1.0.0b1 history
- Fixed duplicate InvocationAgentServerHost imports in README
- Fixed README titles to match Verify Readmes pattern
- Fixed tracing tests to use TestClient and set APPLICATIONINSIGHTS env var
- Fixed test pollution: OTel provider reused across test modules
- Removed obsolete baggage test
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address pvaneck review comments from PR #45925 - Added docstring for TracingHelper.__init__ connection_string param - Added enrichment processor dupe guard (_enrichment_configured flag) - Fixed InMemorySpanExporter import path in test_tracing.py - Fixed @app → @server variable name mismatch in test_tracing.py - Updated invocations CHANGELOG with 2.0.0b1 + kept 1.0.0b1 history - Fixed duplicate InvocationAgentServerHost imports in README - Fixed README titles to match Verify Readmes pattern - Fixed tracing tests to use TestClient and set APPLICATIONINSIGHTS env var - Fixed test pollution: OTel provider reused across test modules - Removed obsolete baggage test Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix CHANGELOGs: add 1.0.0b1 history to core, keep invocations at 1.0.0b1 Core: added historical 1.0.0b1 entry below 2.0.0b1, removed stale leaf_customer_span_id from features. Invocations: reverted to single 1.0.0b1 entry (new package, no prior releases). Updated feature list to reflect InvocationAgentServerHost. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Refactor TracingHelper class → module functions + host method Replaced the TracingHelper class with: - configure_tracing() — standalone function for exporter setup, overridable via AgentServerHost(configure_tracing=my_func) or disabled with configure_tracing=None - request_span() — module-level context manager for span creation - end_span/record_error/trace_stream — module-level lifecycle helpers - AgentServerHost.request_span() — thin method that delegates with pre-populated host identity (agent_id, project_id) Protocol SDKs now use self.request_span() instead of self._tracing.request_span() with None checks. All functions are no-ops when opentelemetry-api is not installed. Removed TracingHelper from core __init__.py exports. Agent identity (name/version/project_id) moved to AgentServerHost. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Update core sample to use new tracing API (self.request_span) Removed TracingHelper import, contextlib.nullcontext pattern, and None checks. Now uses self.request_span() and _tracing.record_error(). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Remove get_logger() — use logging.getLogger directly get_logger() was a one-liner wrapping logging.getLogger('azure.ai.agentserver'). Replaced all usages with direct logging.getLogger() calls and deleted _logger.py. Removed get_logger from core __init__.py exports. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Make OTel a primary dependency, remove _HAS_OTEL guard Moved opentelemetry-api, opentelemetry-sdk, opentelemetry-exporter-otlp, and azure-monitor-opentelemetry-exporter from optional [tracing] extras to primary dependencies. Removed _HAS_OTEL flag, try/except import guard, and all conditional checks — OTel is always available. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Update sdk/agentserver/azure-ai-agentserver-invocations/tests/test_tracing.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update sdk/agentserver/azure-ai-agentserver-invocations/tests/test_tracing.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update sdk/agentserver/azure-ai-agentserver-core/azure/ai/agentserver/core/_tracing.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * removed redundant packages * Address PR review: expose tracing functions publicly, fix imports, update CHANGELOG - Exported end_span, record_error, trace_stream from core __init__.py (no more importing internal _tracing module from other packages) - Updated invocations to use public imports from core - Updated selfhosted sample to use public record_error import - Added None guard in _wrap_streaming_response for otel_span - Fixed test docstring mismatch (tracing_disabled_by_default) - Updated CHANGELOG to reflect TracingHelper → functions change - Fixed get_logger import in githubcopilot adapter Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Spec compliance: platform header, tracer scope, error attrs, baggage, isolation headers, HTTP/2 - x-platform-server header now includes version and python runtime - Instrumentation scope: Azure.AI.AgentServer (core), .Invocations (invocations) - record_error() now sets error.type attribute per OTel semantic conventions - baggage header included in W3C trace context extraction - x-request-id propagated into span attributes - Platform isolation headers (x-agent-user-isolation-key, x-agent-chat-isolation-key) exposed via request.state - HTTP/2 disabled in Hypercorn config (spec requires HTTP/1.1 only) - Fixed get_logger import in githubcopilot adapter Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Spec compliance: baggage keys, SSE keep-alive, structured logging, 
SIGTERM forwarding - Re-added W3C baggage propagation for invocation_id/session_id - Added SSE_KEEPALIVE_INTERVAL env var and resolve_sse_keepalive_interval() - Added sse_keepalive_stream() as AgentServerHost static method (not in tracing) - Added _InvocationLogFilter for structured log scope with InvocationId/SessionId - Added SIGTERM handler in run() that logs and re-raises for Hypercorn - Separated trace_stream (tracing concern) from sse_keepalive_stream (transport) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix PR review: contextvars for logging, SIGTERM restore, deduplicate log handler - Replaced per-request logger.addFilter/removeFilter with contextvars (_invocation_id_var, _session_id_var) for concurrency-safe structured logging. Filter installed once at module level, reads from contextvars. - SIGTERM handler now restored in finally block after run() exits. - _setup_otlp_log_export only adds LoggingHandler when Azure Monitor handler is not already configured, preventing duplicate log emission. - Fixed Black formatting in test_tracing.py dict literals. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Move azure-monitor-opentelemetry-exporter to optional dep The CI dev-build tool (process_requires) rewrites all azure-* dependency version specs for dev builds, transforming >=1.0.0b21 into >=1.0.0a1,<1.0.0b0 which is unresolvable. The exporter is imported lazily with try/except in _tracing.py so it works correctly when installed separately. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Revert "Move azure-monitor-opentelemetry-exporter to optional dep" This reverts commit 973a32b. * Add azure-monitor-opentelemetry-exporter to CI Artifacts The CI dev-build tool (process_requires) rewrites azure-* dependency version specs. Adding the exporter to Artifacts ensures a compatible dev build is published to the dev feed alongside agentserver packages. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix PR comments: log filter comment, tracing docstring, changelog baggage - Updated _ensure_log_filter comment to say 'first request' not 'module load' - Updated _tracing.py docstring: OTel is required, not optional - Added W3C Baggage and structured logging to invocations CHANGELOG Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Revert "Add azure-monitor-opentelemetry-exporter to CI Artifacts" This reverts commit 15aa0c5. * Fix PR comments: keepalive shield, SIGTERM comment, docstring, thread-safe filter - sse_keepalive_stream: use asyncio.shield to prevent cancelling upstream iterator on timeout. Reuse pending task across timeouts. - SIGTERM handler comment: clarified it logs and re-raises, not forwards. - request_span docstring: removed stale 'no-op when OTel not installed'. - _ensure_log_filter: added threading.Lock for double-checked locking. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix: duplicate imports, sanitize IDs in get/cancel endpoints, copilot adapter logging - Removed duplicate 'import logging' in _invocation.py - Added _sanitize_id for invocation_id and session_id in _traced_invocation_endpoint - Fixed _copilot_adapter.py: consolidated logging import, removed _logging alias Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Spec compliance: baggage propagator, x-request-id baggage, baggage-to-log processor - Replaced TraceContextTextMapPropagator with CompositePropagator (TraceContextTextMapPropagator + W3CBaggagePropagator) to properly extract inbound baggage header into OTel context. - x-request-id now set as both span attribute AND baggage entry for downstream propagation. - Added _BaggageLogRecordProcessor that copies all W3C Baggage entries into every OTel log record's attributes for end-to-end correlation. Registered on both Azure Monitor and OTLP log providers. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix import ordering: stdlib → third-party → local per ruff/isort - _base.py: moved 'import sys' to stdlib group - _tracing.py: moved opentelemetry imports to third-party group before constants - _invocation.py: moved contextvars to stdlib group, opentelemetry to third-party, removed duplicate import Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix _BaggageLogRecordProcessor: rename emit → on_emit per OTel SDK API The OTel SDK's LogRecordProcessor interface requires on_emit(), not emit(). This caused AttributeError when the log handler tried to process log records through the processor chain. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add AgentConfig dataclass on app.config, replace Constants New frozen dataclass AgentConfig populated from env vars at init time: app.config.agent_name, .agent_version, .agent_id, .project_id, .project_endpoint, .session_id, .port, .appinsights_connection_string, .otlp_endpoint, .sse_keepalive_interval - Replaced Constants class with private _ENV_* constants in _config.py - AgentServerHost stores self.config = AgentConfig.from_env() - Invocations uses self.config.session_id instead of os.environ.get() - Exported AgentConfig from core __init__.py - Updated all tests to use string literals for env var names Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Set service.name to agent_name, add operation_name to get/cancel spans - service.name span attribute now uses agent_name (falls back to 'azure.ai.agentserver' when agent_name is empty) - _traced_invocation_endpoint now passes operation_name for get_invocation and cancel_invocation spans Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Suppress noisy Azure SDK and OTel exporter INFO logs Set azure.core.pipeline.policies.http_logging_policy and azure.monitor.opentelemetry.exporter loggers to WARNING by default to avoid 
flooding stderr with HTTP request/response details and exporter transmission status. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Set cloud_RoleName to agent name via OTel Resource service.name The OTel Resource's service.name attribute maps to cloud_RoleName in App Insights. Now set to FOUNDRY_AGENT_NAME (falls back to 'azure.ai.agentserver' when not set). This ensures both spans and logs show the agent name as the cloud role in App Insights. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Use _config._ENV_FOUNDRY_AGENT_NAME instead of hardcoded string Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(core): add flush_spans() to drain BatchSpanProcessor before sandbox suspend (#46181) BatchSpanProcessor exports spans on a background timer (default 5s). In hosted sandbox environments the platform may suspend the process immediately after an HTTP response is sent, before the timer fires. This causes short-lived spans (e.g. LangGraph per-node invoke_agent spans) to be lost. Add flush_spans() to the core public API and call it from trace_stream's finally block so the streaming path also flushes. Verified: same agent produces 19-31 spans with flush vs 11 without. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Update sdk/agentserver/azure-ai-agentserver-core/azure/ai/agentserver/core/_config.py Co-authored-by: Johan Stenberg (MSFT) <johan.stenberg@microsoft.com> * Update sdk/agentserver/azure-ai-agentserver-core/azure/ai/agentserver/core/_config.py Co-authored-by: Johan Stenberg (MSFT) <johan.stenberg@microsoft.com> * fix(agentserver-core): move agent identity attrs to _on_ending in FoundryEnrichmentSpanProcessor (#46186) * fix: move agent identity attrs to _on_ending in FoundryEnrichmentSpanProcessor Move gen_ai.agent.name, gen_ai.agent.version, and gen_ai.agent.id from on_start to _on_ending so underlying frameworks (LangChain, Semantic Kernel, etc.) 
cannot overwrite them. Uses guarded direct _attributes access as a workaround for opentelemetry-sdk <=1.40.0 spec-compliance gap where set_attribute() is a no-op during _on_ending. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: add shutdown() to _CollectorExporter for SDK compatibility Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Apply suggestion from @Copilot Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Remove redundant shutdown method definition --------- Co-authored-by: Neehar Duvvuri <neeharduvvuri@Neehars-MacBook-Pro.local> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix: update type imports and add missing method return type annotation * removed dataclass * Fix cspell errors: coro→coroutine, reraises→re_raises, sess→session Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix mypy: add type ignore for _BaggageLogRecordProcessor arg-type Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix pylint docstrings: add types, keyword args, rtype, unused arg prefix - request_span (both _base.py and _tracing.py): added :type:, :keyword:, :paramtype:, :rtype:, and missing instrumentation_scope doc - _handle_sigterm: prefixed unused args with _ (_signum, _frame) - sse_keepalive_stream: added :type: and :rtype: - end_span, flush_spans, record_error, trace_stream: added :type: - _BaggageLogRecordProcessor.on_emit: added :param log_data: Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Bump minimum opentelemetry version to 1.33.0 Ensures _on_ending span processor method and stable baggage/log APIs are available. Avoids edge cases with older SDK versions. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Bump opentelemetry dependencies to version 1.40.0 * Bump opentelemetry dependencies to version 1.40.0 in dev requirements * Update sdk/agentserver/azure-ai-agentserver-core/azure/ai/agentserver/core/_base.py Co-authored-by: Johan Stenberg (MSFT) <johan.stenberg@microsoft.com> * updated codeowner --------- Co-authored-by: Ankit Sinha <anksinha@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Nagkumar Arkalgud <nagkumar91@users.noreply.github.com> Co-authored-by: Johan Stenberg (MSFT) <johan.stenberg@microsoft.com> Co-authored-by: Neehar Duvvuri <40341266+needuv@users.noreply.github.com> Co-authored-by: Neehar Duvvuri <neeharduvvuri@Neehars-MacBook-Pro.local>
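The keep-alive fix mentioned in the commit message above (use asyncio.shield so a timeout does not cancel the upstream iterator, and reuse the pending task across timeouts) can be sketched as follows. This is an illustration of the technique, not the SDK's `sse_keepalive_stream`: the function name, the keep-alive frame, and the demo source are all assumptions.

```python
import asyncio
from typing import AsyncIterator, Optional

KEEPALIVE_FRAME = b": keep-alive\n\n"  # assumed SSE comment frame


async def keepalive_stream(
    source: AsyncIterator[bytes], interval: float
) -> AsyncIterator[bytes]:
    """Yield chunks from source, emitting a keep-alive frame after each idle interval."""
    iterator = source.__aiter__()
    pending: Optional[asyncio.Future] = None
    try:
        while True:
            if pending is None:
                # One task per upstream item; it is reused across timeouts.
                pending = asyncio.ensure_future(iterator.__anext__())
            timed_out = False
            try:
                # shield() ensures the timeout cancels only the wait,
                # never the upstream __anext__() call itself.
                chunk = await asyncio.wait_for(asyncio.shield(pending), timeout=interval)
            except asyncio.TimeoutError:
                timed_out = True
            except StopAsyncIteration:
                return
            if timed_out:
                yield KEEPALIVE_FRAME
            else:
                pending = None
                yield chunk
    finally:
        # If the consumer stops early, do not leave the upstream task running.
        if pending is not None and not pending.done():
            pending.cancel()


async def slow_source() -> AsyncIterator[bytes]:
    yield b"data: first\n\n"
    await asyncio.sleep(0.3)  # idle gap longer than the keep-alive interval
    yield b"data: second\n\n"


async def collect():
    return [frame async for frame in keepalive_stream(slow_source(), interval=0.05)]


frames = asyncio.run(collect())
data_frames = [frame for frame in frames if frame != KEEPALIVE_FRAME]
```

Without the shield, `wait_for` would cancel the in-flight `__anext__()` on every timeout, which for an async generator source typically ends the stream.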
- x-agent-response-id header)
- agent_session_id on all response.* events
- agent_session_id to _ExecutionContext and thread through pipeline
- ## 1.0.0b1 (Unreleased) section
- __version__ in responses __init__.py
- _generated import * from __init__.py, manually re-export top-level helpers
- map_responses_server() in README with ResponseHandler()
- get_input_text, get_input_expanded, get_conversation_id at top level
- getattr(request, "model", None) → request.model in samples and README