Python: [BREAKING] Standardize orchestration terminal outputs as AgentResponse by moonbox3 · Pull Request #5301 · microsoft/agent-framework

moonbox3 · 2026-04-16T06:28:42Z

Motivation and Context

Important

Breaking changes are scoped to the still-experimental agent-framework-orchestrations package. Core agent-framework workflow code is non-breaking.

workflow.as_agent().run(prompt) for SequentialBuilder returned the user's input plus every agent's reply instead of the last agent's answer. The same class of bug affected Concurrent, GroupChat, Magentic, and Handoff: they yielded list[Message] conversation dumps as output events, so as_agent() consumers got a conversation dump instead of an answer.

This PR standardizes the terminal-output shape for every built-in orchestration to AgentResponse (or per-chunk AgentResponseUpdate while streaming). One contract: each orchestration's terminal output event carries the meaningful answer, not a conversation transcript.

Per-orchestration contract

| Orchestration | Terminal `output` event |
| --- | --- |
| Sequential (agent terminator) | Last agent's `AgentResponse` (streaming chunks; the last `AgentExecutor` itself is the workflow's output executor) |
| Sequential (custom-executor terminator) | The custom executor's own `AgentResponse`; see "Custom terminators" below |
| Concurrent (default aggregator) | `AgentResponse` with one assistant message per participant (no user prompt prepended) |
| Concurrent (custom aggregator) | Whatever the aggregator yields |
| GroupChat | `AgentResponse` with the orchestrator's completion message |
| Magentic | `AgentResponse` with the manager's synthesized final answer |
| Handoff | No terminal yield; per-agent responses surface as `output` events as agents speak |
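The shapes in the table can be illustrated with a minimal sketch. `Message` and `AgentResponse` below are simplified stand-in dataclasses, not the framework's real classes, and the three helper functions are hypothetical names used only to contrast the old and new output shapes.

```python
from dataclasses import dataclass, field


# Stand-ins for illustration only; the real framework types carry more fields.
@dataclass
class Message:
    role: str
    text: str


@dataclass
class AgentResponse:
    messages: list[Message] = field(default_factory=list)

    @property
    def text(self) -> str:
        return "\n".join(m.text for m in self.messages)


def legacy_sequential_output(prompt: str, replies: list[str]) -> list[Message]:
    """Old shape: the user's input plus every agent's reply."""
    return [Message("user", prompt)] + [Message("assistant", r) for r in replies]


def sequential_output(replies: list[str]) -> AgentResponse:
    """New shape: only the last agent's answer."""
    return AgentResponse(messages=[Message("assistant", replies[-1])])


def concurrent_default_output(replies: list[str]) -> AgentResponse:
    """New shape: one assistant message per participant, no user prompt."""
    return AgentResponse(messages=[Message("assistant", r) for r in replies])


replies = ["draft", "reviewed draft"]
old = legacy_sequential_output("Write a tagline", replies)
new = sequential_output(replies)
fan_out = concurrent_default_output(replies)
```

In this sketch `old` is a three-message transcript, while `new` carries only the final reply and `fan_out` carries one assistant message per participant.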

SequentialBuilder simplification

SequentialBuilder.build() no longer wraps the chain with a synthetic terminal node (_EndWithConversation removed). The last participant is registered as the workflow's output_executor directly. Two consequences:

Custom-executor terminators must yield directly. A custom Executor used as the last participant must call await ctx.yield_output(AgentResponse(messages=[...])). Intermediate custom executors continue to use ctx.send_message(list[Message]) for chaining.
Output type uniformity. The terminal output is AgentResponse, not list[Message].

The one in-tree sample using this pattern (samples/03-workflows/orchestrations/sequential_custom_executors.py) and one test were updated.
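A custom terminator under the new wiring might look roughly like the sketch below. It does not import the framework: `RecordingContext` is a hypothetical stand-in for `WorkflowContext` that only records calls, and `Message` / `AgentResponse` are simplified stand-ins. Only the method names `send_message` and `yield_output` are taken from this PR.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Message:  # simplified stand-in
    role: str
    text: str


@dataclass
class AgentResponse:  # simplified stand-in
    messages: list[Message]


@dataclass
class RecordingContext:
    """Hypothetical stand-in for WorkflowContext that records what was sent."""

    sent: list[Any] = field(default_factory=list)
    outputs: list[Any] = field(default_factory=list)

    async def send_message(self, message: Any) -> None:
        self.sent.append(message)

    async def yield_output(self, output: Any) -> None:
        self.outputs.append(output)


async def intermediate_step(conversation: list[Message], ctx: RecordingContext) -> None:
    # Intermediate custom executors keep chaining the conversation forward.
    await ctx.send_message(conversation + [Message("assistant", "notes")])


async def terminal_step(conversation: list[Message], ctx: RecordingContext) -> None:
    # The last participant yields the answer itself; no synthetic end node wraps it.
    summary = Message("assistant", f"Summary of {len(conversation)} messages")
    await ctx.yield_output(AgentResponse(messages=[summary]))


async def run_chain() -> RecordingContext:
    ctx = RecordingContext()
    await intermediate_step([Message("user", "hi")], ctx)
    await terminal_step(ctx.sent[-1], ctx)
    return ctx


ctx = asyncio.run(run_chain())
```

The point of the sketch is the split of responsibilities: the intermediate step forwards a `list[Message]`, and only the terminal step produces an `AgentResponse` output.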

intermediate_outputs=True on builders continues to work as on main — it flips the workflow's output_executors whitelist off so every executor's yield_output surfaces as a workflow output event. No semantic change.
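The whitelist behavior can be modeled in a few lines. This is a sketch of the filtering rule as described above, not the framework's implementation: `surfaces_as_output` and `whitelist_for` are hypothetical names, and using `None` to mean "whitelist off" is an assumption made for the example.

```python
from typing import Optional


def surfaces_as_output(executor_id: str, output_executors: Optional[set[str]]) -> bool:
    """Return True if a yield_output from executor_id becomes a workflow output event.

    Assumption: None models the whitelist being switched off.
    """
    return output_executors is None or executor_id in output_executors


def whitelist_for(participants: list[str], intermediate_outputs: bool) -> Optional[set[str]]:
    # Default: only the last participant is registered as an output executor.
    # intermediate_outputs=True: whitelist off, so every executor's yield surfaces.
    return None if intermediate_outputs else {participants[-1]}


participants = ["writer", "reviewer"]
default = whitelist_for(participants, intermediate_outputs=False)
verbose = whitelist_for(participants, intermediate_outputs=True)
```

With `default`, only `"reviewer"` passes the filter; with `verbose`, both participants do.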

HIL terminator support

AgentApprovalExecutor now accepts allow_direct_output. When the HIL-wrapped agent is the last participant in SequentialBuilder, the inner approval workflow's response surfaces as the wrapping workflow's terminal output event. A small sibling class _TerminalAgentRequestInfoExecutor yields AgentResponse (instead of AgentExecutorResponse) in the terminator role, so the unwrap happens at the source rather than at the WorkflowExecutor boundary in core.

Behavior changes (orchestrations package)

workflow.as_agent().run(prompt) returns only the answer for each orchestration, matching the agent contract.
Sequential / GroupChat / Magentic terminal output: list[Message] (full conversation) → AgentResponse (answer only).
Concurrent default aggregator: list[Message] (user + per-agent) → AgentResponse (per-agent only).
Handoff: no synthetic terminal event. Per-agent output events are the end of the stream.
_EndWithConversation removed from SequentialBuilder; custom-executor terminators must yield_output(AgentResponse) directly.

Core changes (non-breaking)

WorkflowExecutor handlers annotated WorkflowContext[Any, Any] (was WorkflowContext[Any]) — permissive, lets WorkflowExecutor serve as a workflow's output_executor when allow_direct_output=True.
WorkflowAgent's streaming converter constructs fresh AgentResponseUpdate instances instead of mutating the executor's payload — independent mutation-safety fix surfaced during review.
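The mutation-safety point can be shown with a small sketch. `AgentResponseUpdate` here is a simplified stand-in dataclass, the `author_name` field is chosen only for illustration, and both converter functions are hypothetical; the real converter may copy different fields.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class AgentResponseUpdate:  # simplified stand-in
    text: str
    author_name: Optional[str] = None


def convert_in_place(update: AgentResponseUpdate, author: str) -> AgentResponseUpdate:
    # Risky: the executor's payload changes for every other holder of it.
    update.author_name = author
    return update


def convert_fresh(update: AgentResponseUpdate, author: str) -> AgentResponseUpdate:
    # Safer: build a new instance and leave the executor's payload untouched.
    return replace(update, author_name=author)


payload = AgentResponseUpdate(text="chunk")
converted = convert_fresh(payload, author="writer")
```

After `convert_fresh`, `payload` is unchanged and `converted` is a separate object, which is the property the regression test in this PR is described as guarding.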

Contribution Checklist

The code builds clean without any errors or warnings
The PR follows the Contribution Guidelines
All unit tests pass, and I have added new tests where possible
Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

…. Align other orchestration outputs

moonbox3 · 2026-04-16T06:31:17Z

Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
| --- | --- | --- | --- | --- |
| packages/core/agent_framework/_workflows | | | | |
| _agent.py | 358 | 71 | 80% | 66, 74–80, 116–117, 210, 271, 284, 351, 362, 364, 424, 430, 440–441, 448, 450, 456, 518–519, 528, 536, 562, 595–597, 599, 601, 603, 608, 613, 667, 697, 714, 753–756, 762, 768, 772–773, 776–782, 786–787, 795, 856, 863, 869–870, 881, 913, 920, 941, 950, 954, 956–958, 965 |
| _agent_executor.py | 200 | 16 | 92% | 166, 190, 231, 255, 275–276, 356–358, 360, 370–371, 490–491, 563, 569 |
| _workflow_executor.py | 187 | 30 | 83% | 94, 404, 464, 487, 489, 497–498, 503, 505, 510, 512, 593–599, 603–605, 613, 618, 629, 639, 643, 649, 653, 663, 667 |
| packages/orchestrations/agent_framework_orchestrations | | | | |
| _base_group_chat_orchestrator.py | 172 | 12 | 93% | 109, 277, 292, 326–328, 332, 351, 439, 485–487 |
| _concurrent.py | 144 | 26 | 81% | 51, 60–61, 69–70, 89–90, 95, 113, 118, 123–124, 145, 155, 162, 231, 247, 250, 307, 337, 339–340, 342, 347, 360, 364 |
| _group_chat.py | 329 | 70 | 78% | 175, 344, 351, 384, 395–396, 402, 407, 428, 432, 445–446, 459, 474–475, 477, 493, 520–525, 527, 561–564, 566, 571–575, 664, 667, 706, 709, 712, 715, 723, 735–736, 738–739, 741–742, 744, 749, 752, 761, 767, 811–812, 816–817, 831–832, 834–835, 866–867, 933, 952, 960, 965–967, 974, 984 |
| _handoff.py | 335 | 52 | 84% | 104–105, 107, 160–170, 172, 174, 176, 181, 312, 337, 364, 390, 454, 497, 505, 509–510, 541–543, 548–550, 670, 673, 686, 748, 753, 760, 770, 772, 791, 793, 875–876, 908–909, 1010, 1017, 1089–1090, 1092 |
| _magentic.py | 592 | 91 | 84% | 65–74, 79, 83–94, 259, 270, 274, 294, 355, 364, 366, 408, 425, 434–435, 437–439, 441, 452, 595, 597, 637, 687, 723–725, 727, 737, 745–746, 813–816, 907, 913, 919, 961, 999, 1031, 1048, 1059, 1116–1117, 1121–1123, 1147, 1171–1172, 1185, 1205, 1228, 1273–1274, 1312–1313, 1477, 1480, 1489, 1492, 1497, 1548–1549, 1590–1591, 1639, 1669, 1727, 1741, 1752 |
| _orchestration_request_info.py | 66 | 1 | 98% | 162 |
| _sequential.py | 80 | 6 | 92% | 48, 120, 131, 137, 185, 213 |
| TOTAL | 30444 | 3541 | 88% | |

Python Unit Test Overview

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 6122 | 30 💤 | 0 ❌ | 0 🔥 | 1m 35s ⏱️ |

Copilot

Pull request overview

This PR changes Python orchestration output semantics so workflow.as_agent().run(prompt) returns only the final “answer” (as an AgentResponse) while intermediate participant activity is surfaced via data events (instead of additional output events). It adds a single new knob (AgentExecutor(emit_intermediate_data=...)) and updates orchestrations, tests, and a sample to align with the clarified contract.

Changes:

Add AgentExecutor.emit_intermediate_data to emit parallel data events for AgentResponse / AgentResponseUpdate.
Update Sequential/Concurrent/GroupChat/Magentic/Handoff orchestrations to reserve output for terminal answers and use data for intermediate agent activity.
Update orchestration tests and a sample to validate/illustrate the new terminal vs intermediate event behavior.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.


| File | Description |
| --- | --- |
| python/samples/03-workflows/agents/sequential_workflow_as_agent.py | Updates sample to reflect `as_agent()` returning only the final agent response and notes intermediate observation via `data`. |
| python/packages/orchestrations/tests/test_sequential.py | Rewrites sequential tests to assert terminal `output` is last agent only and intermediate activity is `data` when enabled. |
| python/packages/orchestrations/tests/test_magentic.py | Updates Magentic tests to expect `AgentResponse` terminal outputs and `data` events for intermediate updates. |
| python/packages/orchestrations/tests/test_handoff.py | Updates Handoff tests to reflect no synthetic terminal output and per-agent outputs as the stream result. |
| python/packages/orchestrations/tests/test_group_chat.py | Updates GroupChat tests to expect a single terminal `AgentResponse` from the orchestrator and intermediate agent updates as `data`. |
| python/packages/orchestrations/tests/test_concurrent.py | Updates Concurrent tests to expect default aggregator returns `AgentResponse` with assistant messages only (no user prompt). |
| python/packages/orchestrations/agent_framework_orchestrations/_sequential.py | Changes sequential wiring so the last agent executor is the workflow output executor; uses `data` events for intermediate participants. |
| python/packages/orchestrations/agent_framework_orchestrations/_orchestration_request_info.py | Plumbs `emit_intermediate_data` through `AgentApprovalExecutor` into the inner `AgentExecutor`. |
| python/packages/orchestrations/agent_framework_orchestrations/_magentic.py | Changes Magentic terminal output to `AgentResponse`; adds participant `emit_intermediate_data` plumbing; restricts output executor to orchestrator. |
| python/packages/orchestrations/agent_framework_orchestrations/_handoff.py | Removes terminal yield; relies on per-agent output events as the observable result. |
| python/packages/orchestrations/agent_framework_orchestrations/_group_chat.py | Changes GroupChat completion to yield an `AgentResponse`; uses participant `emit_intermediate_data`; restricts outputs to orchestrator. |
| python/packages/orchestrations/agent_framework_orchestrations/_concurrent.py | Changes default aggregator to yield `AgentResponse` (assistant replies only) and uses participant `emit_intermediate_data` for intermediate observation. |
| python/packages/orchestrations/agent_framework_orchestrations/_base_group_chat_orchestrator.py | Updates base orchestrator termination/max-round outputs to yield `AgentResponse` completion messages. |
| python/packages/core/tests/workflow/test_agent_executor.py | Adds core tests asserting `emit_intermediate_data` produces `data` events in both streaming and non-streaming modes. |
| python/packages/core/agent_framework/_workflows/_agent_executor.py | Implements `emit_intermediate_data` by emitting `WorkflowEvent.emit(...)/type='data'` alongside existing outputs. |
| python/packages/core/agent_framework/_workflows/_agent.py | Updates `WorkflowAgent` to consume `data` events carrying `AgentResponse` / `AgentResponseUpdate` as part of agent responses. |

moonbox3

Automated Code Review

Reviewers: 3 | Confidence: 82%

✓ Correctness

This PR refactors orchestration output from list[Message] to AgentResponse and introduces emit_intermediate_data on AgentExecutor to surface intermediate participants via data events. The core mechanics are sound: output_executors is now always set (not conditioned on intermediate_outputs), intermediate agents emit data events, and the _agent.py adapter layer correctly processes both output and data events. The existing unresolved review comments remain valid (AgentApprovalExecutor as last participant in sequential, sample file stale imports/docstrings, data-event propagation through WorkflowExecutor, and docstring ordering in concurrent aggregator). I found no additional correctness bugs beyond those already raised.

✓ Security Reliability

This PR refactors orchestration outputs from list[Message] to AgentResponse and introduces emit_intermediate_data on AgentExecutor to surface intermediate participants as data events while reserving output events for the terminal answer. The design is clean and the output_executors / data-event split is well-implemented. The key security/reliability issues already identified in previous review comments (AgentApprovalExecutor as terminal participant producing no output, docstring ordering inaccuracy, data event forwarding in WorkflowExecutor) remain unresolved and still apply. One new concern: the handoff workflow removes all terminal yield_output calls, meaning a termination condition that fires before any agent speaks (e.g., from user input alone) will cause the workflow to go idle with zero output events — consumers relying on at least one output would see silent completion.

✓ Test Coverage

The PR introduces a significant refactoring of orchestration output contracts (from list[Message] to AgentResponse) and adds emit_intermediate_data wiring to surface per-agent responses as data events. Test coverage for sequential, group-chat, and magentic workflows is solid, with good new tests for non-streaming, streaming, intermediate outputs, and as_agent scenarios. However, there are notable gaps: ConcurrentBuilder's intermediate_outputs=True path is completely untested despite new wiring, the handoff async-termination test was weakened to a bare IDLE-state check, and sequential intermediate_outputs is only tested in non-streaming mode.

Automated review by moonbox3's agents

…on-outputs

1. Sample cleanup: Remove commented-out FoundryChatClient block and update prerequisites to reference OPENAI_CHAT_MODEL_ID instead of FOUNDRY_* vars.
2. Sequential approval output: Change _EndWithConversation.end_with_agent_executor_response from a no-op sink to yield response.agent_response. When the last participant is AgentApprovalExecutor (via with_request_info), _EndWithConversation is the output executor so the yield produces the terminal answer. When the last participant is a regular AgentExecutor, _EndWithConversation is not in output_executors so the yield is silently filtered out.
3. Forward data events through WorkflowExecutor: _process_workflow_result now also forwards 'data' events from sub-workflows so that emit_intermediate_data=True on AgentExecutor works correctly when wrapped in AgentApprovalExecutor.
4. Concurrent docstring: Update _AggregateAgentConversations docstring to say 'deterministic participant order' instead of 'completion order'.
5. Add test_concurrent_intermediate_outputs_emits_data_events verifying that ConcurrentBuilder(intermediate_outputs=True) emits per-participant data events alongside the single aggregated output event.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…outputs (microsoft#5301)

Address PR review comments 2, 3, and 5:

- Add test_sequential_request_info_last_participant_emits_output: Verifies that when the last participant is wrapped via with_request_info() (AgentApprovalExecutor), the workflow still emits a terminal output after approval, exercising the _EndWithConversation.end_with_agent_executor_response fallback path.
- Add test_sequential_request_info_with_intermediate_outputs_emits_data_events: Verifies that emit_intermediate_data=True works correctly through AgentApprovalExecutor wrapping; WorkflowExecutor._process_result already forwards data events from sub-workflows, so intermediate agent responses surface as data events in the parent workflow.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…#5301) Update cast() calls in _group_chat.py and _magentic.py to use WorkflowContext[Never, AgentResponse] instead of the old WorkflowContext[Never, list[Message]], matching the updated method signatures in _base_group_chat_orchestrator.py. Fix _sequential.py _EndWithConversation.end_with_agent_executor_response to declare WorkflowContext[Any, AgentResponse] so yield_output accepts AgentResponse[None]. Fix _workflow_executor.py data event forwarding to handle nullable executor_id. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Extract event.data into a typed local variable before the isinstance check to avoid pyright narrowing it to AgentResponse[Unknown]. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…icrosoft#5301) Add pyright: ignore[reportMissingImports] to orjson imports that are already guarded by try/except ImportError, matching the existing pattern used elsewhere in the samples. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
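For readers unfamiliar with the pattern, a guarded optional import with the pyright suppression looks roughly like this. The `dumps` helper and the fallback to the standard-library `json` module are assumptions made for the sketch; the samples may handle the missing dependency differently.

```python
import json

try:
    # Optional dependency: pyright would otherwise flag it when not installed.
    import orjson  # pyright: ignore[reportMissingImports]
except ImportError:
    orjson = None  # type: ignore[assignment]


def dumps(obj: object) -> str:
    """Serialize with orjson when available, else fall back to json (assumed fallback)."""
    if orjson is not None:
        return orjson.dumps(obj).decode("utf-8")
    return json.dumps(obj)


encoded = dumps({"answer": 42})
```

The two backends format whitespace differently, so callers should compare parsed values rather than raw strings.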

eavanvalkenburg

small nit on the env var names, otherwise good to go

…on-outputs

Reverts the mistaken switch from FoundryChatClient to OpenAIChatClient in the sequential workflow as agent sample. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

… reasoning conversion

Layered on top of the prior review-feedback work in this branch.

Renames:

- AgentExecutor.emit_intermediate_data -> emit_data_events (mechanical rename; orchestration semantics live at the orchestration layer, not the general-purpose executor). Forwarded through MagenticAgentExecutor, AgentApprovalExecutor, and all orchestration call sites.
- HandoffAgentExecutor._check_terminate_and_yield -> _should_terminate (pure predicate; no longer yields anything). HandoffBuilder docstring rewritten to describe the new per-agent AgentResponse output contract.

WorkflowAgent reasoning-content conversion:

- Add _rewrite_text_to_reasoning(contents) and _msg_as_reasoning(msg) helpers; the as_agent() path now reframes text content from data events as text_reasoning Content blocks before merging into the AgentResponse.
- Consumers iterate msg.contents and branch on content.type, the same path they already use for Claude thinking and OpenAI reasoning. No new field on Message/AgentResponse/WorkflowEvent.
- Streaming branch constructs fresh AgentResponseUpdate instances instead of mutating shared payloads (regression test added).
- Helper _msg_maybe_reasoning consolidates the conditional rewrite at three call sites in the non-streaming conversion.

Tests:

- TestWorkflowAgentReasoningHelpers + TestWorkflowAgentDataEventReasoningConversion add 9 new tests covering helpers, non-streaming, streaming, mixed content, already-reasoning passthrough, and mutation-safety regression.
- Updated test_sequential_as_agent_with_intermediate_outputs_includes_chain to assert text_reasoning content for intermediate agents.
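The text-to-reasoning rewrite described in this commit can be sketched as follows. `Content` is a simplified stand-in with only `type` and `text`, and both functions are hypothetical illustrations of the described behavior rather than the private helpers themselves.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Content:  # simplified stand-in
    type: str
    text: str = ""


def rewrite_text_to_reasoning(contents: list[Content]) -> list[Content]:
    """Reframe plain text blocks as reasoning; pass everything else through."""
    return [replace(c, type="text_reasoning") if c.type == "text" else c for c in contents]


def split_answer_and_reasoning(contents: list[Content]) -> tuple[list[str], list[str]]:
    # Consumer side: iterate the contents and branch on content.type.
    answer = [c.text for c in contents if c.type == "text"]
    reasoning = [c.text for c in contents if c.type == "text_reasoning"]
    return answer, reasoning


intermediate = rewrite_text_to_reasoning(
    [Content("text", "draft"), Content("text_reasoning", "already"), Content("function_call")]
)
merged = intermediate + [Content("text", "final answer")]
answer, reasoning = split_answer_and_reasoning(merged)
```

In the sketch, intermediate text ends up in `reasoning`, already-reasoning and non-text blocks pass through unchanged, and only the terminal text remains in `answer`.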

…on-outputs

Eight callsites in _group_chat.py still cast to WorkflowContext[Never, AgentResponse] but the base orchestrator methods now accept the wider WorkflowContext[Never, AgentResponse | AgentResponseUpdate] (mode-aware yields). W_OutT is invariant, so the narrower cast is not assignable. Magentic was widened in the same commit; this catches the GroupChat callsites that were missed.
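The invariance issue can be reproduced with a small generic class. `Ctx` is a hypothetical stand-in for a context type parameterized by its output type, and `AgentResponse` / `AgentResponseUpdate` are empty stand-in classes; the code runs either way, and the mismatch is only reported by a static type checker.

```python
from typing import Any, Generic, TypeVar, Union, cast

W_OutT = TypeVar("W_OutT")  # no covariant/contravariant flag, so invariant


class AgentResponse:  # stand-in
    pass


class AgentResponseUpdate:  # stand-in
    pass


class Ctx(Generic[W_OutT]):
    """Hypothetical stand-in for a workflow context parameterized by output type."""

    def __init__(self) -> None:
        self.outputs: list[W_OutT] = []

    def yield_output(self, value: W_OutT) -> None:
        self.outputs.append(value)


Wide = Union[AgentResponse, AgentResponseUpdate]


def complete(ctx: "Ctx[Wide]", streaming: bool) -> None:
    # Mode-aware yield: an update while streaming, a full response otherwise.
    ctx.yield_output(AgentResponseUpdate() if streaming else AgentResponse())


raw: Any = Ctx()
narrow = cast("Ctx[AgentResponse]", raw)  # passing this to complete() fails type checking
wide = cast("Ctx[Wide]", raw)  # matches the widened signature
complete(wide, streaming=True)
```

Invariance is the safe default here: if `Ctx[AgentResponse]` were accepted where `Ctx[Wide]` is expected, `complete` could push an `AgentResponseUpdate` into a context declared to hold only `AgentResponse`.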

…soft#5553)

These two integration tests have been failing in the merge queue across multiple unrelated PRs (5301, 5531). Both are marked `@pytest.mark.flaky` with 3 retries, but all attempts fail back-to-back. Skipping both with a reason pointing to microsoft#5553 so they can be fixed properly without continuing to block unrelated merges.

- packages/foundry_hosting/tests/test_responses_int.py::TestOptions::test_temperature_and_max_tokens
- packages/foundry/tests/foundry/test_foundry_embedding_client.py::TestFoundryEmbeddingIntegration::test_text_embedding_live

Also includes a one-line uv.lock specifier-ordering normalization auto-applied by the poe-check pre-commit hook.

* Python: bump package versions for 1.2.2 release

PATCH bump (1.2.1 -> 1.2.2) for the released cohort. Five PRs land in this window:

- agent-framework-openai: fix file_search citations breaking the assistant-message history roundtrip (#5557); drives the released-tier PATCH
- agent-framework-orchestrations: [BREAKING] standardize orchestration terminal outputs as AgentResponse (#5301)
- agent-framework-core, agent-framework-declarative: preserve Workflow.run() shared state across calls, accept list[Message] in declarative start executor, and coerce Enum values when serializing PowerFx symbols (#5531)
- agent-framework-foundry-hosting: add hosted Durable Workflow support (#5531)
- agent-framework-azure-contentunderstanding: new alpha package, Azure AI Content Understanding context provider (#4829)
- dependencies: workspace package dependency refresh (#5555)

Per lockstep convention, all 21 beta packages stamp 1.0.0b260429 and all 4 alpha packages (now including the new contentunderstanding) stamp 1.0.0a260429. Date stamp reflects 2026-04-29 Pacific.

Every non-core package floor on agent-framework-core is raised to >=1.2.2; the new contentunderstanding package's stale >=1.0.0 floor is brought into line.

Two follow-on fixes bundled to keep validate-dependency-bounds-test green at lowest-direct resolution:

- Bump agent-framework-azure-contentunderstanding's azure-ai-contentunderstanding lower bound from >=1.0.0 to >=1.0.1 (1.0.0 ships without proper typing; pyright reports 65 unknown-type errors)
- Add pyright ignore comments to core/foundry/__init__.pyi for the new alpha package's type-stub imports, since alpha packages are not in core's [all] extra and therefore aren't installed at lowest-direct

* Python: add #5552 to 1.2.2 CHANGELOG

Add the streaming-span observability fix to the Fixed section. PR is on upstream/main but not yet pulled into origin/main; the code itself will land via the PR merge.

* Python: address PR #5561 review feedback on dependency bounds

Two packaging fixes flagged in review:

1. agent-framework-azure-contentunderstanding: add agent-framework-foundry as a runtime dependency. The package's README directs users to `pip install agent-framework-azure-contentunderstanding --pre` and the basic example imports `FoundryChatClient` from `agent_framework.foundry`, so the documented install path was failing with ImportError. Pulling agent-framework-foundry into deps makes the advertised entry path self-contained.
2. agent-framework-foundry: bump agent-framework-openai lower bound from >=1.1.0 to >=1.2.2,<2. Foundry imports private modules from agent_framework_openai (`_chat_client.py:22`, `_agent.py:34`), so resolvers were free to pair foundry==1.2.2 with older OpenAI versions that lack this release's coordinated Responses/history fix. Lockstep the floor with the released cohort to prevent mismatched installs.

Both changes pass `validate-dependency-bounds-test` lower + upper at their respective packages.

Fix orchestration outputs so as_agent() returns the final answer only…

5a5133c

…. Align other orchestration outputs

moonbox3 self-assigned this Apr 16, 2026

Copilot AI review requested due to automatic review settings April 16, 2026 06:28

moonbox3 added the python label Apr 16, 2026

Copilot started reviewing on behalf of moonbox3 April 16, 2026 06:29

Copilot AI reviewed Apr 16, 2026


moonbox3 commented Apr 16, 2026


Comment thread python/packages/orchestrations/tests/test_concurrent.py

Copilot and others added 7 commits April 16, 2026 08:39

Merge remote-tracking branch 'upstream/main' into improve-orchestrati…

240e307

…on-outputs

Fix pyright reportUnknownVariableType in _agent.py (microsoft#5301)

675afe0


Address review feedback for microsoft#5301: review comment fixes

28cf71f

moonbox3 requested review from TaoChenOSU, eavanvalkenburg and giles17 April 16, 2026 10:21

eavanvalkenburg approved these changes Apr 16, 2026


Comment thread python/samples/03-workflows/agents/sequential_workflow_as_agent.py Outdated

Comment thread python/samples/03-workflows/agents/sequential_workflow_as_agent.py Outdated

Copilot added 2 commits April 16, 2026 10:46

Merge remote-tracking branch 'upstream/main' into improve-orchestrati…

e3057e1

…on-outputs

Address review feedback for microsoft#5301: review comment fixes

09a12fe

moonbox3 commented Apr 16, 2026


Comment thread python/samples/03-workflows/agents/sequential_workflow_as_agent.py

Revert sequential_workflow_as_agent sample to FoundryChatClient

cec1993


lokitoth added the needs_port_to_dotnet Indicate this item needs to also be done for .Net label Apr 16, 2026

TaoChenOSU reviewed Apr 16, 2026


Comment thread python/packages/core/agent_framework/_workflows/_agent_executor.py Outdated

TaoChenOSU reviewed Apr 16, 2026


Comment thread python/packages/core/agent_framework/_workflows/_agent.py Outdated

TaoChenOSU reviewed Apr 16, 2026


Comment thread python/packages/core/agent_framework/_workflows/_agent.py Outdated

TaoChenOSU reviewed Apr 16, 2026


Comment thread python/packages/orchestrations/agent_framework_orchestrations/_sequential.py Outdated

Scope to agent output semantics only

1a4c975

moonbox3 changed the title ~~Python: [BREAKING] Python: Fix orchestration outputs so as_agent() returns the final answer only. Align other orchestration outputs.~~ Python: [BREAKING] Standardize orchestration terminal outputs as AgentResponse Apr 28, 2026

moonbox3 added 3 commits April 28, 2026 13:57

yield AgentResponseUpdate streaming, AgentResponse non-streaming

96cc455

Merge remote-tracking branch 'upstream/main' into improve-orchestrati…

e13fbfe

…on-outputs

TaoChenOSU approved these changes Apr 28, 2026


Comment thread python/packages/orchestrations/agent_framework_orchestrations/_concurrent.py

Comment thread python/packages/orchestrations/agent_framework_orchestrations/_concurrent.py

moonbox3 added this pull request to the merge queue Apr 28, 2026