
[agentserver] azure-ai-agentserver-responses package#46052

Open
ankitbko wants to merge 181 commits into main from agentserver/responses

Conversation


@ankitbko ankitbko commented Apr 1, 2026

Description

Split from PR #45925 for independent review of the azure-ai-agentserver-responses package.

This PR builds on the core+invocations changes in #45925 and adds the responses protocol implementation including:

  • Hosting: Starlette-based ASGI routing, request validation, background execution, SSE streaming
  • Storage: Foundry storage provider using azure.core.AsyncPipelineClient, in-memory provider for testing
  • Streaming: Response event stream builders for messages, function calls, reasoning
  • Models: TypeSpec-generated models with validation

Dependencies


Carried-over review threads from #45925

The following unresolved review threads were moved from the original PR. They all target code in this package.

Code fixes needed (copilot-reviewer)

| File | Issue |
| --- | --- |
| `_http_errors.py` | `_not_found` returns 404 but uses `code="invalid_request"` — should use `code="not_found"` and `error_type="not_found_error"` |
| `_handlers.py` | `response_handler` validates parameter count but does not validate that the handler is `async def` or returns `AsyncIterable` |
| `_foundry_serializer.py` | `serialize_create_request` calls `.as_dict()` unconditionally — breaks if a dict is passed instead of a model |
| `_base.py` | `ResponseProviderProtocol` docstrings specify `KeyError`, but the Foundry provider raises `FoundryResourceNotFoundError` (2 threads, duplicate) |
| `_runtime_state.py` | O(n²) history concatenation via `[*A, *history]` — use `deque` or accumulate-then-concat |
| `_event_subject.py` | Unbounded `asyncio.Queue` per subscriber with no backpressure — can cause OOM (2 threads, duplicate) |
| `_foundry_settings.py` | `build_url()` uses manual string concatenation — switch to `urllib.parse.urlencode` |
| `_request_parsing.py` | Unnecessary backslash continuations inside parentheses — reformat for clarity |
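The `_runtime_state.py` item above can be illustrated with a minimal sketch; the class and method names here are hypothetical stand-ins, not the package's actual API. The idea is to replace repeated `[*new_items, *history]` rebuilds (O(n) per call, O(n²) overall) with O(1) prepends on a `collections.deque`:

```python
from collections import deque


class RuntimeState:
    """Illustrative stand-in for the history tracking in _runtime_state.py."""

    def __init__(self):
        # deque supports O(1) prepends, so batches of earlier history can be
        # added without rebuilding the whole list each time.
        self._history = deque()

    def prepend(self, items):
        # extendleft pushes elements one by one onto the left, which reverses
        # them; reverse the batch first so the original order is preserved.
        self._history.extendleft(reversed(items))

    def as_list(self):
        return list(self._history)
```

An alternative with the same complexity is to accumulate batches in a list and concatenate once at the end (accumulate-then-concat), which avoids `deque` entirely.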

Design/architecture (johanste)

| File | Issue |
| --- | --- |
| `_routing.py` | Routing/hosting restructure — see comment. johanste proposes that `ResponseHandler` should inherit from `AgentHost` to be the host entry point |
| `samples/GetStarted/app.py` | Samples should not need to import anything from core |
| `samples/GetStarted/app.py` | `ResponseHandler` should be the "host" — users should not need `AgentHost` directly |

Samples (RaviPidaparthi)

| File | Issue |
| --- | --- |
| `samples/README.md` | Need more samples: Tier 1/2/3 experience, OpenAI client integration. Reference: .NET samples |

lusu-msft and others added 30 commits March 17, 2026 13:03
- added type spec model generation
- add model validator generation
- creating a server
* trying

* generate contract models

* add validator generator

* fix model generation

* add more unit tests

* fix conflict

* refined model generation
* trying

* generate contract models

* add validator generator

* fix model generation

* add more unit tests

* fix conflict

* refined model generation

* renamed the package
* create response

* cancel and delete

* fix options
Ankit Sinha and others added 20 commits April 7, 2026 20:32
…on deserialization

- Fix C-MSG-01: Input items without 'type' now default to 'message' per OpenAI spec
  - Generator pipeline: validation-overlay.yaml adds default_discriminator
  - Schema walker passes defaultValue through to emitter
  - Emitter generates value.get('type', 'message') fallback
  - Helper get_input_expanded() injects type='message' for items missing it

- Fix temperature/top_p: Override Optional[int] → Optional[float] via _patch.py
  - Uses official subclass customization pattern (not monkey-patching)
  - CreateResponse and ResponseObject subclassed with correct rest_field types

- Fix union deserialization: Patch _deserialize_sequence to reject plain strings
  - model_base.py bug: str treated as iterable sequence in Union[dict,str,list]
  - Makefile applies sed patch after each generate-models run

- Fix Makefile: Use pinned npm deps from emitter-package.json via tsp-client sync
  - Removed hardcoded LOCAL_TYPESPEC_PACKAGES variable
  - npm install runs inside TempTypeSpecFiles/ using pinned package.json
  - Added generate-openapi target; fixed spec path to virtual-public-preview

- Add OpenAI interop test suite: 84 tests (53 wire compliance + 31 SDK round-trip)
  - All 560 tests pass, 0 xfailed
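The C-MSG-01 fix described above can be sketched in isolation; the function name is hypothetical and the real logic lives in the generated deserializer, but the fallback it describes is just a `dict.get` with a default:

```python
def deserialize_input_item(value: dict) -> dict:
    """Sketch of the C-MSG-01 fix: input items without a 'type'
    discriminator default to 'message', per the OpenAI Responses spec,
    instead of failing deserialization."""
    # value.get('type', 'message') is the fallback the emitter generates.
    item_type = value.get("type", "message")
    return {**value, "type": item_type}
```

A bare `{"role": "user", "content": "hi"}` item therefore deserializes as a message, while items that carry an explicit `type` keep it.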
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- request_span (both _base.py and _tracing.py): added :type:, :keyword:,
  :paramtype:, :rtype:, and missing instrumentation_scope doc
- _handle_sigterm: prefixed unused args with _ (_signum, _frame)
- sse_keepalive_stream: added :type: and :rtype:
- end_span, flush_spans, record_error, trace_stream: added :type:
- _BaggageLogRecordProcessor.on_emit: added :param log_data:

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
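The docstring fields added in the commit above follow the standard reStructuredText conventions. As an illustration only (this signature is invented, not the real `request_span`), a fully documented keyword argument looks like:

```python
def request_span(name: str, *, instrumentation_scope: str = "agentserver") -> dict:
    """Start a tracing span for an incoming request.

    :param name: Human-readable span name.
    :type name: str
    :keyword instrumentation_scope: Scope name recorded on the span.
    :paramtype instrumentation_scope: str
    :return: A minimal span record (illustrative only).
    :rtype: dict
    """
    return {"name": name, "scope": instrumentation_scope}
```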
…leanup

- Add 12 new contract test files (39 tests) for .NET protocol parity:
  agent_reference_auto_stamp, bg_stream_disconnect, cancel_consistency,
  connection_termination, conversation_store, handler_driven_persistence,
  output_manipulation_detection, response_id_auto_stamp, response_id_header,
  sentinel_removal, session_id_resolution, snapshot_consistency

- Fix 6 implementation gaps identified by parity analysis:
  - Conformant ID format in tests (resp_/item_/msg_ prefixes)
  - agent_reference propagation (return {} for None)
  - FR-008a output manipulation detection before state validation
  - run_background waits for response.created, default in_progress
  - FR-013 background+stream disconnect shielding via asyncio.shield()
  - CancelledError: re-raise unknown instead of transitioning to failed

- Remove internal spec numbers (FR/S/B) from client-facing error messages

- Add ruff exclude for _generated directory in pyproject.toml
- Fix E501, F821, F841 ruff violations in hand-written source and tests
- Revert accidental ruff --fix changes to generated model files

613 passed, 0 failed, 8 skipped
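The FR-013 disconnect shielding mentioned above can be sketched with plain `asyncio`; the function names here are illustrative, not the orchestrator's real API. `asyncio.shield` lets the awaiting request task be cancelled (e.g. on client disconnect) while the wrapped persistence work runs to completion:

```python
import asyncio


async def finalize_response(state: dict) -> dict:
    # Stand-in for the orchestrator's persistence/finalization step.
    await asyncio.sleep(0)
    state["status"] = "completed"
    return state


async def run_background(state: dict) -> dict:
    # Shield finalization so a client disconnect (which cancels the request
    # task) cannot cancel the persistence work mid-write. Cancellation still
    # propagates to this awaiter, but the inner task keeps running.
    return await asyncio.shield(finalize_response(state))
```

Note that `shield` only protects the inner coroutine; the caller still sees `CancelledError` and must decide whether to re-raise or wait for the shielded task.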
Ensures _on_ending span processor method and stable baggage/log APIs
are available. Avoids edge cases with older SDK versions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…e history limit

- get_input_expanded now returns list[Item] with proper discriminated
  subtypes via _deserialize, enabling isinstance checks instead of
  dict-style type inspection
- Replace all hardcoded 10000 history item counts in orchestrator with
  configurable default_fetch_history_count from ResponsesServerOptions
- Add TextResponse convenience class for text-only handlers, supporting
  both complete text (create_text) and streaming (create_text_stream)
- Add StreamingTextDeltas sample demonstrating create_text_stream
- Update GetStarted and ConversationHistory samples to use TextResponse
- Update FunctionCalling sample/scenario and tests for isinstance checks
- Fix ruff lint issues in sample files
…/core/_base.py

Co-authored-by: Johan Stenberg (MSFT) <johan.stenberg@microsoft.com>
…ersation stamping, 22 new tests

Aligned with Azure.AI.AgentServer.Responses .NET PR #57895:

Rule number updates:
- Removed dashes from B-XX references (B-11→B11, B-13→B13, etc.)
- Remapped S-rules to formalized spec numbering (S-047→B38, S-048→B39,
  S-007→FR-006, S-008→FR-007, S-021→S-015, etc.)

Cancel hardening:
- Fixed cancel persistence bug in _finalize_stream
- Added 10s cancel grace period in handle_cancel before forcing terminal

Delete stream cleanup:
- Added delete_stream_events() to protocol and InMemoryResponsesProvider
- Handle_delete now cleans up stream events after deleting response

Store behavior:
- B14: store=false SSE replay now returns 404 (was 400)

Conversation stamping (S-040):
- Normalized polymorphic conversation (str|ConversationParam_2) to
  ConversationReference in ResponseEventStream init
- Thread conversation_id through apply_common_defaults and all
  orchestrator call sites to forcibly stamp on lifecycle events
- Fixed _resolve_conversation_id to handle dict form

Queued status honour:
- Background non-stream responses now honour handler-set queued status
- Orchestrator transitions through in_progress when going from queued
  to a terminal state

New tests (22 total, 635 pass):
- 8 conversation round-trip tests (string/object, default/streaming/bg)
- 3 queued status lifecycle tests
- 2 GET endpoint tests (Accept header, store=false)
- 2 cancel tests (signal trigger, B11 race condition)
- 1 delete test (bg completed response)
- 6 additional contract test improvements

Code formatting: applied ruff format across all files
- _routing.py: use AgentConfig.from_env() for Foundry auto-activation
  and SSE keep-alive merge instead of private _ENV_* imports
- _foundry_settings.py: add from_endpoint() classmethod; from_env()
  delegates to AgentConfig
- _request_parsing.py: _resolve_session_id() takes env_session_id kwarg
  instead of reading os.environ directly
- _endpoint_handler.py: passes host.config.session_id to resolver
- _options.py: remove SSE_KEEPALIVE_INTERVAL from from_env(); AgentConfig
  handles it; inline DEFAULT_FETCH_HISTORY_ITEM_COUNT string
- core _config.py: fix _DEFAULT_SSE_KEEPALIVE_INTERVAL from 15 to 0
  (disabled by default per container spec)
- Revert broken S-024 unknown CancelledError fix (re-add skip)

636 passed, 8 skipped
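The `_request_parsing.py` change above follows a common testability pattern: the environment is read once at the boundary and passed in, rather than inside the helper. A hedged sketch (the signature is illustrative, not the package's actual one):

```python
import os


def resolve_session_id(header_session_id=None, *, env_session_id=None):
    """Sketch: resolve the session id from the request header first,
    falling back to a value the caller read from the environment.
    Taking env_session_id as a kwarg (instead of reading os.environ
    here) keeps the function pure and easy to test."""
    return header_session_id or env_session_id


# The caller (the endpoint handler, in this sketch) reads the
# environment exactly once at the boundary:
session = resolve_session_id(None, env_session_id=os.environ.get("SESSION_ID"))
```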
@ankitbko ankitbko requested review from a team, kashifkhan and xiangyan99 as code owners April 8, 2026 15:48
Ankit Sinha and others added 4 commits April 8, 2026 16:48
)

Replace the heuristic three-var + FOUNDRY_ENVIRONMENT check with a
simple opt-in: is_hosted returns True only when FOUNDRY_HOSTED=true,
which the platform injects exclusively into hosted containers.

This prevents false-positive Foundry storage auto-activation when a
developer sets FOUNDRY_PROJECT_ENDPOINT or other FOUNDRY_* vars locally
(e.g. to use agent_framework features) without intending to run in a
hosted environment.
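The opt-in check described above reduces to a single equality test; this is a minimal sketch with an injectable `env` mapping for testability, not the package's exact implementation:

```python
import os


def is_hosted(env=os.environ) -> bool:
    """Sketch of the opt-in check: Foundry storage auto-activates only
    when the platform-injected FOUNDRY_HOSTED variable is 'true'.
    Other FOUNDRY_* variables set locally never trigger activation."""
    return env.get("FOUNDRY_HOSTED", "").lower() == "true"
```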

Labels

Hosted Agents sdk/agentserver/*


10 participants