fix: strict mode hard system prompt against hallucinated tool output#103

Merged
VforVitorio merged 4 commits into dev from fix/strict-mode-hard-system-prompt on Apr 7, 2026

Conversation

@VforVitorio
Owner

...

The ``strict`` permission mode advertised itself as "no tools — pure
chat only" but ``_run_turn`` still passed the full wrapped tool list to
``model.act()``.  The model saw every tool schema and could emit calls
that the runtime executed silently, defeating the point of the mode.

Strict is now enforced at the SDK boundary: when ``self._mode`` is
``strict``, the ``tools`` kwarg sent to ``model.act()`` is an empty
list.  The model literally cannot see the tool schemas, so there is
nothing to call.  Ask and auto modes continue to receive the full
wrapped list.

Adds two regression tests in ``tests/test_agent/test_core.py``:

- ``test_run_turn_strict_mode_sends_empty_tools`` — asserts ``tools=[]``
  when ``agent._mode = "strict"``.
- ``test_run_turn_non_strict_modes_pass_tools`` — asserts ask/auto
  still receive the full tool list so the fix does not silently
  strip tools from the other modes.

Also fixes the README description, which claimed strict was "read-only"
— it was never read-only; the mode was tool-less in name only.  The
description now matches runtime behaviour.

Closes #99.

The first attempt at fixing #99 passed ``tools=[]`` to ``model.act()``,
but the LM Studio SDK rejects that with
``LMStudioValueError: Tool using actions require at least one tool to
be defined.`` — a crash at runtime the moment the user asked anything
in strict mode.

Strict now routes through ``model.respond()`` — the pure-chat SDK
primitive that has no tool concept at all — so the model never sees
a tool schema and cannot emit a tool call regardless of
hallucination.  The callback shape (``on_message``,
``on_prediction_fragment``) is identical to ``act()`` so the spinner
and token counter reuse the same helpers.  ``PredictionResult.stats``
is appended to the per-round ``stats_capture`` list so
``_build_stats_line`` works unchanged, and elapsed time is measured
manually via ``time.monotonic()`` because ``PredictionResult`` (unlike
``ActResult``) does not expose ``total_time_seconds``.
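The strict-path changes described above can be sketched roughly as follows. This is a simplified stand-in, not the repository's actual code: the class shape, helper names, and result attributes are assumptions, and the stub substitutes for the real LM Studio SDK model object.

```python
import time


class TurnSketch:
    """Rough sketch of the strict-mode routing described above."""

    def __init__(self, model, mode, tools):
        self.model = model
        self._mode = mode
        self._tools = tools
        self.stats_capture = []

    def run_turn(self, prompt):
        start = time.monotonic()
        if self._mode == "strict":
            # Pure-chat path: respond() has no tool concept, so no
            # tool schema ever reaches the model.
            result = self.model.respond(prompt)
            # PredictionResult lacks total_time_seconds, so elapsed
            # time is measured manually.
            elapsed = time.monotonic() - start
        else:
            # ask/auto keep the full wrapped tool list.
            result = self.model.act(prompt, tools=self._tools)
            elapsed = getattr(result, "total_time_seconds", 0.0)
        # Stats feed the same _build_stats_line-style helper either way.
        self.stats_capture.append(getattr(result, "stats", None))
        return result, elapsed
```

The real turn loop is async and wires up the streaming callbacks as well; this sketch keeps only the routing and timing decisions.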

The existing regression test
``test_run_turn_strict_mode_sends_empty_tools`` is replaced by
``test_run_turn_strict_mode_uses_respond_not_act``, which pins the
routing: ``respond`` is awaited once, ``act`` is never awaited, and
``tools`` is not passed to ``respond`` at all.  The ask/auto
regression test is renamed to make the SDK-path distinction explicit.
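A simplified sketch of what that routing test might assert, using ``unittest.mock`` in place of the repository's actual fixtures (the ``_run_turn`` stand-in below is an assumption, not the real agent code):

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def _run_turn(model, mode, tools, prompt):
    # Simplified stand-in for the agent's turn loop.
    if mode == "strict":
        # Strict never touches act() and passes no tools kwarg.
        return await model.respond(prompt)
    return await model.act(prompt, tools=tools)


def test_run_turn_strict_mode_uses_respond_not_act():
    model = MagicMock()
    model.respond = AsyncMock(return_value="chat-only")
    model.act = AsyncMock()

    result = asyncio.run(_run_turn(model, "strict", ["some_tool"], "hi"))

    # respond() is awaited exactly once, act() never, and no tools
    # kwarg reaches respond() at all.
    model.respond.assert_awaited_once_with("hi")
    model.act.assert_not_awaited()
    assert "tools" not in model.respond.await_args.kwargs
    assert result == "chat-only"
```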

``model.act()`` invokes ``on_prediction_fragment(fragment, round_index)``
with two positional args, but ``model.respond()`` (used in strict mode)
calls it with just one — a signature mismatch the SDK does not
document anywhere except in ``json_api.py:1486``.  The previous
definition required both args, so every prediction fragment raised
``TypeError: missing 1 required positional argument: '_round_index'``
inside the SDK's ``handle_rx_event`` and the turn silently failed.

Giving ``_round_index`` a default lets a single function serve both
SDK paths.
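The two call shapes reconcile with a single default argument, roughly like this (the fragment payload here is a plain string for illustration; the real SDK passes a fragment object):

```python
fragments = []


def on_prediction_fragment(fragment, _round_index=0):
    """Shared streaming callback (sketch).

    act() invokes this with two positional args (fragment, round_index);
    respond() invokes it with just the fragment.  The default on
    _round_index lets one definition serve both SDK call shapes.
    """
    fragments.append((fragment, _round_index))
```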

Adds a fragment-firing side effect to the respond mock in
``_make_mock_respond_model`` so the existing
``test_run_turn_strict_mode_uses_respond_not_act`` test actually
exercises the 1-arg callback shape and would fail again if the
default is removed.
@VforVitorio VforVitorio merged commit 1bfcb32 into dev Apr 7, 2026
6 checks passed
@VforVitorio VforVitorio deleted the fix/strict-mode-hard-system-prompt branch April 7, 2026 09:41