Migrate to litellm for model compatibility #24

Open

trevormells wants to merge 10 commits into Pickle-Pixel:main from trevormells:mmigrate_to_litellm

Conversation

@trevormells

Summary

This PR migrates ApplyPilot’s LLM layer to a LiteLLM-based adapter and standardizes provider/model configuration across CLI, wizard, docs, and runtime checks. It reduces provider-specific logic, adds Anthropic support, and tightens test coverage around LLM resolution/client behavior.


What Changed

  • Replaced the custom HTTP-based LLM implementation with a thin LiteLLM wrapper in llm.py.
  • Added a single resolve_llm_config() contract for provider/model/api key resolution.
  • Expanded provider support:
    • GEMINI_API_KEY
    • OPENAI_API_KEY
    • ANTHROPIC_API_KEY
    • LLM_URL (OpenAI-compatible local endpoint)
    • LLM_API_KEY (generic key fallback)
  • Standardized model semantics:
    • Provider-prefixed models supported (e.g. openai/gpt-4o-mini, gemini-3.0-flash)
    • Inference order for provider defaults when LLM_MODEL is not set
  • Updated LLM call sites to use client.chat(..., max_output_tokens=...) and removed ask() usage.
  • Updated applypilot init flow to allow saving multiple provider credentials and explicit LLM_MODEL.
  • Updated doctor/tier checks and failure messaging to match the new config contract.
  • Updated docs and .env.example for new provider options and model format.
  • Added optional Gemini smoke test + pytest marker config.
  • Added unit tests for LLM config resolution and LiteLLM client request behavior.
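The resolution contract described above (a provider-prefixed `LLM_MODEL` wins, otherwise the first configured provider key in a fixed inference order, with `LLM_URL`/`LLM_API_KEY` covering local OpenAI-compatible endpoints) can be sketched roughly as follows. The function name mirrors the PR's `resolve_llm_config()`, but the tables, default model strings, inference order, and error handling here are all assumptions, not the PR's actual implementation:

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Illustrative tables; the real names and defaults live in llm.py.
PROVIDER_KEYS = {
    "gemini": "GEMINI_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}
DEFAULT_MODELS = {  # placeholder model IDs
    "gemini": "gemini/gemini-2.0-flash",
    "openai": "openai/gpt-4o-mini",
    "anthropic": "anthropic/claude-3-5-haiku-latest",
}

@dataclass
class LLMConfig:
    provider: str
    model: str
    api_key: Optional[str]
    base_url: Optional[str] = None

def resolve_llm_config(env: Mapping[str, str]) -> LLMConfig:
    """Resolve provider/model/key from environment variables.

    Precedence: explicit provider prefix in LLM_MODEL, then the first
    provider whose key is set (PROVIDER_KEYS order), then LLM_URL as a
    local OpenAI-compatible endpoint.
    """
    model = env.get("LLM_MODEL", "")
    if "/" in model and model.split("/", 1)[0] in PROVIDER_KEYS:
        provider = model.split("/", 1)[0]
    else:
        provider = next((p for p, k in PROVIDER_KEYS.items() if env.get(k)), None)
        if provider is None and env.get("LLM_URL"):
            provider = "local"
        if provider is None:
            raise RuntimeError("No LLM provider configured")
        model = model or DEFAULT_MODELS.get(provider, "")
    # Provider-specific key first, generic LLM_API_KEY as fallback.
    api_key = env.get(PROVIDER_KEYS.get(provider, ""), "") or env.get("LLM_API_KEY")
    return LLMConfig(provider, model, api_key, env.get("LLM_URL"))
```

Passing the environment in as a mapping (rather than reading `os.environ` directly) keeps the resolution logic trivially unit-testable, which matches the test files this PR adds.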

Tests

Added

  • test_llm_resolution.py
  • test_llm_client.py
  • test_gemini_smoke.py (optional smoke: @pytest.mark.smoke)

Suggested commands

pytest -q tests/test_llm_resolution.py tests/test_llm_client.py
GEMINI_API_KEY=... pytest -m smoke -q tests/test_gemini_smoke.py

Notes

  • Runtime LLM routing now follows LLM_MODEL provider prefix when multiple providers are configured.
  • Local OpenAI-compatible endpoints are supported via LLM_URL (with optional LLM_API_KEY).
  • Default model selections were refreshed (Gemini/OpenAI/Anthropic/local).
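For readers unfamiliar with the `client.chat(..., max_output_tokens=...)` call sites mentioned above, a minimal sketch of what a thin LiteLLM wrapper could look like follows. The `chat` method name and `max_output_tokens` parameter mirror the PR's described API; the class name, constructor, and lazy import are assumptions (note that LiteLLM itself names the token-limit parameter `max_tokens`):

```python
from typing import Optional

class LLMClient:
    """Thin wrapper sketch over litellm.completion (names assumed)."""

    def __init__(self, model: str, api_key: Optional[str] = None,
                 base_url: Optional[str] = None):
        self.model = model          # provider-prefixed, e.g. "openai/gpt-4o-mini"
        self.api_key = api_key
        self.base_url = base_url

    def chat(self, messages: list, max_output_tokens: int = 1024) -> str:
        # Imported lazily so config-only code paths work without litellm.
        import litellm
        resp = litellm.completion(
            model=self.model,
            messages=messages,
            max_tokens=max_output_tokens,  # LiteLLM's name for the budget
            api_key=self.api_key,
            api_base=self.base_url,        # OpenAI-compatible local endpoint
        )
        return resp.choices[0].message.content
```

Because LiteLLM routes on the model's provider prefix, the wrapper needs no per-provider branching; that is the main simplification this migration buys.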

@trevormells
Author

Addresses a wide range of model compatibility issues that have been surfaced in #19, #4, #9, and #16.
@Pickle-Pixel

rothnic added a commit to rothnic/ApplyPilot that referenced this pull request Feb 27, 2026
- Add AgentBackend abstraction for Claude and OpenCode
- Implement backend detection and preference logic
- Add MCP server management for both backends
- Maintain compatibility with PR Pickle-Pixel#24 LiteLLM integration
- Update scoring/tailoring/wizard to use new backend system
trevormells and others added 2 commits February 27, 2026 15:10
Adds LLM_STREAMING_MODE environment variable to enable streaming mode
for LLM proxies that require it. When enabled, uses LiteLLM with
stream=True and accumulates chunks into plain text response.
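The accumulation step this commit describes (call LiteLLM with `stream=True` and join the delta chunks into one plain-text reply) might look like the sketch below. `LLM_STREAMING_MODE` is the commit's own variable; the function names and chunk handling here are assumptions:

```python
import os
from typing import Iterable, Optional

def join_stream(deltas: Iterable[Optional[str]]) -> str:
    """Accumulate streamed text deltas; None entries are empty keep-alives."""
    return "".join(d for d in deltas if d)

def chat_streaming(model: str, messages: list, max_tokens: int = 1024) -> str:
    """Streamed completion for proxies that require stream=True."""
    import litellm  # lazy import, as elsewhere in this sketch
    chunks = litellm.completion(
        model=model, messages=messages, max_tokens=max_tokens, stream=True,
    )
    return join_stream(c.choices[0].delta.content for c in chunks)

def streaming_enabled() -> bool:
    # The commit gates this on an environment variable; the accepted
    # truthy values here are an assumption.
    return os.environ.get("LLM_STREAMING_MODE", "").lower() in ("1", "true", "yes")
```

Splitting the pure accumulation out of the network call keeps the behavior testable without hitting a provider.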
@rothnic

rothnic commented Mar 2, 2026

I created a PR to incorporate a fix I needed to make this litellm branch work for me. I've been testing the combination of this litellm migration, my opencode support, greenhouse api support, and an improved iterative tailoring workflow in my dev integration branch, and I'm working on extracting each feature out separately.

rothnic added a commit to rothnic/ApplyPilot that referenced this pull request Mar 2, 2026
- Add AgentBackend abstraction for Claude and OpenCode
- Implement backend detection and preference logic
- Add MCP server management for both backends
- Maintain compatibility with PR Pickle-Pixel#24 LiteLLM integration
- Update scoring/tailoring/wizard to use new backend system
rothnic added a commit to rothnic/ApplyPilot that referenced this pull request Mar 2, 2026
fix(llm): Add LLM_STREAMING_MODE option for custom endpoints
madisonrickert added a commit to madisonrickert/ApplyPilot that referenced this pull request Mar 12, 2026
tariks added a commit to tariks/ApplyPilot that referenced this pull request Apr 1, 2026
Issue Pickle-Pixel#4 (upstream): Gemini 2.5 thinking-token models silently consume
the max_tokens budget before generating output, causing the tailoring
stage to exhaust all retries with truncated/empty JSON. The community-confirmed
fix is to raise the limits substantially.
- tailor.py: 2048 -> 8192 for generation, 512 -> 1024 for judge
- These were also too low for long academic/research CVs added in the
  previous commit.

From upstream PR Pickle-Pixel#24 (selective, without the full LiteLLM migration):
- llm.py: add ANTHROPIC_API_KEY detection and _chat_native_anthropic()
  handler using the Anthropic Messages API format (x-api-key header,
  top-level system field, content[0]["text"] response extraction)
- config.py: include ANTHROPIC_API_KEY in get_tier() and check_tier()
- .env.example: document ANTHROPIC_API_KEY option
- cli.py: route per-attempt tailor/cover logs to ~/.applypilot/logs/
  instead of the terminal (reduces noise; details still available on disk)

Skipped from PR Pickle-Pixel#24: full LiteLLM migration (tight ~=1.63.0 pin on a
fast-moving package, replaces working custom HTTP logic unnecessarily).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
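The native handler this commit describes (raw Anthropic Messages API: `x-api-key` header rather than a bearer token, a top-level `system` field, and `content[0]["text"]` extraction) might be sketched as below. The endpoint URL and `anthropic-version` string are the documented public values; the function names and use of `urllib` are assumptions, chosen to match the repo's "custom HTTP logic" approach:

```python
import json
import urllib.request

ANTHROPIC_URL = "https://api.anthropic.com/v1/messages"

def build_anthropic_request(api_key: str, model: str, system: str,
                            user_text: str, max_tokens: int = 1024):
    """Build (url, headers, body) for a Messages API call."""
    payload = {
        "model": model,
        "system": system,  # top-level field, not a message role
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": user_text}],
    }
    headers = {
        "x-api-key": api_key,  # Anthropic uses x-api-key, not Authorization
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    }
    return ANTHROPIC_URL, headers, json.dumps(payload).encode()

def chat_native_anthropic(api_key, model, system, user_text, max_tokens=1024):
    url, headers, data = build_anthropic_request(
        api_key, model, system, user_text, max_tokens)
    req = urllib.request.Request(url, data=data, headers=headers)
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Messages API returns a list of content blocks; take the first text block.
    return body["content"][0]["text"]
```

Keeping request construction separate from the network call makes the header/payload shape checkable in unit tests without an API key.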
@pgordalina

> I created a PR to incorporate a fix I required to make this litellm branch work for me. I've been testing the combination of this litellm migration, my opencode support, greenhouse api support, and an improved iterative tailoring workflow in my dev integration branch. Trying to extract each feature out separately.

@rothnic thanks for this. I'm sort of new to this world but have some programming experience. How do I get a version of ApplyPilot that includes your changes? Do we need to wait for your PR to be accepted and merged into a new release?

@Deg5112

Deg5112 commented Apr 17, 2026

Hi Team,

Can we get the fix merged here? Also having the same issue, thx!
