
AiCore Project


AiCore is a comprehensive framework for integrating various language models and embedding providers with a unified interface. It supports both synchronous and asynchronous operations for generating text completions and embeddings, featuring:

  • 🔌 Multi-provider support: OpenAI, Mistral, Groq, Gemini, NVIDIA, and more
  • 🤖 Reasoning augmentation: Enhance traditional LLMs with reasoning capabilities
  • 📊 Observability: Built-in monitoring and analytics
  • 💰 Token tracking: Detailed usage metrics and cost tracking
  • ⚡ Flexible deployment: Chainlit, FastAPI, and standalone script support
  • 🛠️ MCP Integration: Connect to Model Control Protocol servers via tool calling
  • 🖥️ Claude Code provider: Use your Claude subscription locally or remotely via the Claude Agents Python SDK — no API key required

Quickstart

pip install git+https://github.com/BrunoV21/AiCore

or

pip install git+https://github.com/BrunoV21/AiCore.git#egg=core-for-ai[all]

or

pip install core-for-ai[all]

Make your First Request

Sync

from aicore.llm import Llm
from aicore.llm.config import LlmConfig
import os

llm_config = LlmConfig(
  provider="openai",
  model="gpt-4o",
  api_key="super_secret_openai_key"
)

llm = Llm.from_config(llm_config)

# Generate completion
response = llm.complete("Hello, how are you?")
print(response)

Async

from aicore.llm import Llm
from aicore.llm.config import LlmConfig
import asyncio

async def main():
  llm_config = LlmConfig(
    provider="openai",
    model="gpt-4o",
    api_key="super_secret_openai_key"
  )

  llm = Llm.from_config(llm_config)

  # Generate completion
  response = await llm.acomplete("Hello, how are you?")
  print(response)

if __name__ == "__main__":
  asyncio.run(main())

More examples are available in examples/ and docs/examples/.

Key Features

Multi-provider Support

LLM Providers:

  • Anthropic
  • OpenAI
  • Mistral
  • Groq
  • Gemini
  • NVIDIA
  • OpenRouter
  • DeepSeek
  • Claude Code (local — via Claude Agents Python SDK, no API key required)
  • Remote Claude Code (remote — connects to an aicore-proxy-server over HTTP)

Embedding Providers:

  • OpenAI
  • Mistral
  • Groq
  • Gemini
  • NVIDIA

Observability Tools:

  • Operation tracking and metrics collection
  • Interactive dashboard for visualization
  • Token usage and latency monitoring
  • Cost tracking

MCP Integration:

  • Connect to multiple MCP servers simultaneously
  • Automatic tool discovery and calling
  • Support for WebSocket, SSE, and stdio transports

To configure the application for testing, you need to set up a config.yml file with the necessary API keys and model names for each provider you intend to use. The CONFIG_PATH environment variable should point to the location of this file. Here's an example of how to set up the config.yml file:

# config.yml
embeddings:
  provider: "openai" # or "mistral", "groq", "gemini", "nvidia"
  api_key: "your_openai_api_key"
  model: "text-embedding-3-small" # Optional

llm:
  provider: "openai" # or "mistral", "groq", "gemini", "nvidia"
  api_key: "your_openai_api_key"
  model: "gpt-o4" # Optional
  temperature: 0.1
  max_tokens: 1028
  reasoning_effort: "high"
  mcp_config: "./mcp_config.json" # Path to MCP configuration
  max_tool_calls_per_response: 3 # Optional limit on tool calls

Config examples for the various providers are included in the config directory.

MCP Integration Example

from aicore.llm import Llm
from aicore.config import Config
import asyncio

async def main():
    # Load configuration with MCP settings
    config = Config.from_yaml("./config/config_example_mcp.yml")
    
    # Initialize LLM with MCP capabilities
    llm = Llm.from_config(config.llm)
    
    # Make async request that can use MCP-connected tools
    response = await llm.acomplete(
        "Search for latest news about AI advancements",
        system_prompt="Use available tools to gather information"
    )
    print(response)

asyncio.run(main())

Example MCP configuration (mcp_config.json):

{
  "mcpServers": {
    "search-server": {
      "transport_type": "ws",
      "url": "ws://localhost:8080",
      "description": "WebSocket server for search functionality"
    },
    "data-server": {
      "transport_type": "stdio",
      "command": "python",
      "args": ["data_server.py"],
      "description": "Local data processing server"
    },
    "brave-search": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-brave-search"
      ],
      "env": {
        "BRAVE_API_KEY": "SUPER-SECRET-BRAVE-SEARCH-API-KEY"
      }
    }
  }
}
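
The MCP settings can also be wired in programmatically instead of through YAML. Below is a minimal sketch, assuming LlmConfig accepts the same mcp_config and max_tool_calls_per_response fields shown in the config.yml example above:

from aicore.llm import Llm
from aicore.llm.config import LlmConfig

# Sketch: field names mirror the llm section of config.yml
llm_config = LlmConfig(
    provider="openai",
    model="gpt-4o",
    api_key="super_secret_openai_key",
    mcp_config="./mcp_config.json",      # same file as shown above
    max_tool_calls_per_response=3,       # optional limit on tool calls
)

llm = Llm.from_config(llm_config)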

Claude Code Provider

AiCore supports routing completions through your Claude subscription via the Claude Agents Python SDK. No Anthropic API key is required — auth is handled entirely by the Claude Code CLI. This is exposed through two providers and an optional proxy server:

Component | Description
claude_code | Local provider — runs the Claude Code CLI on the same machine as AiCore
remote_claude_code | Remote provider — connects over HTTP to an aicore-proxy-server instance
aicore-proxy-server | Proxy server — wraps the local CLI as a FastAPI SSE service, shareable over a network

Both providers share the same acomplete() / complete() interface and emit identical tool-call streaming events — you can switch between them with a single config change.
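
For example, switching between the two is just a matter of which LlmConfig fields you populate (the base_url and api_key values below are placeholders):

from aicore.llm import Llm
from aicore.llm.config import LlmConfig

# Local: the Claude Code CLI runs on this machine, no credentials needed
local_config = LlmConfig(
    provider="claude_code",
    model="claude-sonnet-4-5-20250929",
)

# Remote: same interface, but requests go through an aicore-proxy-server
remote_config = LlmConfig(
    provider="remote_claude_code",
    model="claude-sonnet-4-5-20250929",
    base_url="http://your-proxy-host:8080",  # placeholder proxy URL
    api_key="your_proxy_token",              # CLAUDE_PROXY_TOKEN from the server
)

llm = Llm.from_config(local_config)  # or Llm.from_config(remote_config)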


Local Provider (claude_code)

Runs claude-agent-sdk directly on the machine where AiCore is executing.

Prerequisites

# 1. Install the Claude Code CLI (requires Node.js 18+)
npm install -g @anthropic-ai/claude-code

# 2. Authenticate once
claude login

# 3. Install AiCore (the Python SDK is included automatically)
pip install core-for-ai

Quickstart

from aicore.llm import Llm
from aicore.llm.config import LlmConfig
import asyncio

async def main():
    config = LlmConfig(
        provider="claude_code",
        model="claude-sonnet-4-5-20250929",
        # No api_key needed — auth is handled by the CLI
    )

    llm = Llm.from_config(config)
    response = await llm.acomplete("List all Python files in this project")
    print(response)

if __name__ == "__main__":
    asyncio.run(main())

Config File

# config/config_example_claude_code.yml
llm:
  provider: "claude_code"
  model: "claude-sonnet-4-5-20250929"

  # Optional
  permission_mode: "bypassPermissions"   # default — all tools allowed
  cwd: "/path/to/your/project"           # working directory for the CLI
  max_turns: 10                           # limit agentic turns
  mcp_config: "./mcp_config.json"   # pass through an MCP config file
  cli_path: "/usr/local/bin/claude"      # override if CLI is not on PATH
  allowed_tools:
    - "Read"
    - "Write"
    - "Bash"

Proxy Server (aicore-proxy-server)

The proxy server wraps claude-agent-sdk in a FastAPI SSE service so Claude Code can be accessed remotely over HTTP. Useful when:

  • The Claude Code CLI is authenticated on a different machine (e.g. a dev box, a server, or WSL)
  • You want to share a single Claude subscription across multiple AiCore clients
  • Your AiCore workload runs in a container or cloud environment that cannot run the CLI directly

Installation (server-side only)

# Install AiCore with the claude-server extras
pip install core-for-ai[claude-server]

# Also install the Claude Code CLI and authenticate
npm install -g @anthropic-ai/claude-code
claude login

The [claude-server] extra installs fastapi, uvicorn[standard], and python-dotenv. pyngrok is optional and only needed for the ngrok tunnel mode.

Starting the Server

# Minimal — binds to 127.0.0.1:8080, prompts for tunnel choice interactively
aicore-proxy-server

# Fully configured
aicore-proxy-server \
  --host 0.0.0.0 \
  --port 8080 \
  --token my-secret-token \
  --tunnel none \
  --cwd /path/to/project \
  --log-level INFO

# Or via Python module
python -m aicore.scripts.claude_code_proxy_server --port 8080 --tunnel none

On first run the bearer token is auto-generated and printed. Set CLAUDE_PROXY_TOKEN in your environment or .env file to reuse it across restarts, or pass --token explicitly.
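
If you export the token, a remote client can read it from the environment instead of hardcoding it. A minimal sketch (the proxy URL is a placeholder):

import os
from aicore.llm.config import LlmConfig

config = LlmConfig(
    provider="remote_claude_code",
    model="claude-sonnet-4-5-20250929",
    base_url="http://your-proxy-host:8080",
    api_key=os.environ["CLAUDE_PROXY_TOKEN"],  # reuse the token printed at first run
)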

CLI Reference

Flag | Default | Description
--host | 127.0.0.1 | Bind address
--port | 8080 | TCP port
--token | (auto-generated) | Bearer token; also reads the CLAUDE_PROXY_TOKEN env var
--tunnel | (prompt) | none / ngrok / cloudflare / ssh
--tunnel-port | same as --port | Remote port for SSH tunnels
--cwd | (unrestricted) | Force a working directory for all Claude sessions
--allowed-cwd-paths | (any) | Whitelist of cwd values clients may request
--log-level | INFO | DEBUG / INFO / WARNING / ERROR
--cors-origins | * | Allowed CORS origins

Tunnel Support

When --tunnel is omitted the server prompts interactively at startup.

Mode | Requirement | Notes
none | (none) | Local network only
ngrok | pip install pyngrok | Auth token is stored in the OS credential store on first run and loaded automatically on subsequent runs
cloudflare | cloudflared binary on PATH | Quick tunnel, ephemeral URL
ssh | SSH access to a VPS | Prints the ssh -R command; no extra software needed

API Endpoints

Method | Path | Auth | Description
GET | /health | None | Server status, uptime, Claude CLI version, active stream count
GET | /capabilities | Bearer | Supported options and server-enforced defaults
POST | /query | Bearer | Stream a claude-agent-sdk query as SSE
DELETE | /query/{session_id} | Bearer | 501 stub — reserved for future WebSocket-based cancellation
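
As a quick smoke test you can hit these endpoints directly. The sketch below uses only the standard library and assumes the proxy from the examples above is listening on 127.0.0.1:8080; the exact JSON fields returned are not spelled out here:

import json
import urllib.request

BASE_URL = "http://127.0.0.1:8080"   # adjust to your proxy host or tunnel URL
TOKEN = "your_proxy_token"           # CLAUDE_PROXY_TOKEN printed at server startup

# /health requires no auth
with urllib.request.urlopen(f"{BASE_URL}/health") as resp:
    print(json.load(resp))

# /capabilities requires the bearer token
req = urllib.request.Request(
    f"{BASE_URL}/capabilities",
    headers={"Authorization": f"Bearer {TOKEN}"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp))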

Remote Provider (remote_claude_code)

Connects AiCore to a running aicore-proxy-server over HTTP SSE. The remote provider reconstructs the SDK message stream locally, giving the same acomplete() / complete() interface as the local provider — no Claude Code CLI needed on the client side.

Prerequisites (client-side)

pip install core-for-ai   # no CLI or claude-server extras required

The proxy server must be running and reachable before instantiating the provider (a GET /health check is performed automatically at startup, controllable via skip_health_check).

Quickstart

from aicore.llm import Llm
from aicore.llm.config import LlmConfig
import asyncio

async def main():
    config = LlmConfig(
        provider="remote_claude_code",
        model="claude-sonnet-4-5-20250929",
        base_url="http://your-proxy-host:8080",   # or a tunnel URL
        api_key="your_proxy_token",               # CLAUDE_PROXY_TOKEN from server startup
    )

    llm = Llm.from_config(config)
    response = await llm.acomplete("Summarise this codebase")
    print(response)

if __name__ == "__main__":
    asyncio.run(main())

Config File

# config/config_example_remote_claude_code.yml
llm:
  provider: "remote_claude_code"
  model: "claude-sonnet-4-5-20250929"
  base_url: "http://your-proxy-host:8080"   # or the ngrok / cloudflare tunnel URL
  api_key: "your_proxy_token"               # CLAUDE_PROXY_TOKEN printed at server startup

  # Optional — forwarded to the proxy server
  permission_mode: "bypassPermissions"
  cwd: "/path/to/project"                   # must be in server's --allowed-cwd-paths
  max_turns: 10
  allowed_tools:
    - "Bash"
    - "Read"
    - "Write"

  # Skip the GET /health connectivity check at startup
  skip_health_check: false

Tool Call Streaming & Callbacks

Both claude_code and remote_claude_code emit identical tool-call events:

def on_tool_event(event: dict):
    if event["stage"] == "started":
        print(f"→ Calling tool: {event['tool_name']}")
    elif event["stage"] == "concluded":
        status = "✗" if event["is_error"] else "✓"
        print(f"{status} Tool finished: {event['tool_name']}")

llm.tool_callback = on_tool_event

# inside an async function / running event loop
response = await llm.acomplete("Find all TODO comments in the codebase")

TOOL_CALL_START_TOKEN / TOOL_CALL_END_TOKEN are also emitted via stream_handler, so any existing stream consumer works without changes.

Supported Models

Model | Max Tokens | Context Window
claude-sonnet-4-5-20250929 | 64 000 | 200 000
claude-opus-4-6 | 32 000 | 200 000
claude-haiku-4-5-20251001 | 64 000 | 200 000
claude-3-7-sonnet-latest | 64 000 | 200 000
claude-3-5-sonnet-latest | 8 192 | 200 000

Note: temperature, max_tokens, and api_key are ignored by both providers — the Claude Code CLI controls model parameters internally. Cost is reported from ResultMessage.total_cost_usd rather than computed from a pricing table.


Usage

Language Models

You can use the language models to generate text completions. Below is an example of how to use the MistralLlm provider:

from aicore.llm.config import LlmConfig
from aicore.llm.providers import MistralLlm

config = LlmConfig(
    api_key="your_api_key",
    model="your_model_name",
    temperature=0.7,
    max_tokens=100
)

mistral_llm = MistralLlm.from_config(config)
response = mistral_llm.complete(prompt="Hello, how are you?")
print(response)

Loading from a Config File

To load configurations from a YAML file, set the CONFIG_PATH environment variable and use the Config class to load the configurations. Here is an example:

from aicore.config import Config
from aicore.llm import Llm
import os

if __name__ == "__main__":
    os.environ["CONFIG_PATH"] = "./config/config.yml"
    config = Config.from_yaml()
    llm = Llm.from_config(config.llm)
    llm.complete("Once upon a time, there was a")

Make sure your config.yml file is properly set up with the necessary configurations.

Observability

AiCore includes a comprehensive observability module that tracks:

  • Request/response metadata
  • Token usage (prompt, completion, total)
  • Latency metrics (response time, time-to-first-token)
  • Cost estimates (based on provider pricing)
  • Tool call statistics (for MCP integrations)

Dashboard Features

Observability Dashboard

Key metrics tracked:

  • Requests per minute
  • Average response time
  • Token usage trends
  • Error rates
  • Cost projections

from aicore.observability import ObservabilityDashboard

dashboard = ObservabilityDashboard(storage="observability_data.json")
dashboard.run_server(port=8050)

Advanced Usage

Reasoner Augmented Config

AiCore also ships with native support for augmenting traditional LLMs with reasoning capabilities: a reasoning-capable open-source model generates the thinking steps, which are passed to the main LLM so it can produce its answer in a reasoning-augmented way.

This can be useful in multiple scenarios, such as:

  • ensuring your agentic systems keep working with the prompts you have crafted for your favourite LLMs while augmenting them with reasoning steps
  • direct control over how long the reasoner reasons (via the max_tokens param) and how creative it can be (reasoning temperature is decoupled from generation temperature), without compromising generation settings

To leverage reasoning augmentation, just add one of the supported LLM configs under the reasoner field and AiCore handles the rest:

# config.yml
embeddings:
  provider: "openai" # or "mistral", "groq", "gemini", "nvidia"
  api_key: "your_openai_api_key"
  model: "your_openai_embedding_model" # Optional

llm:
  provider: "mistral" # or "openai", "groq", "gemini", "nvidia"
  api_key: "your_mistral_api_key"
  model: "mistral-small-latest" # Optional
  temperature: 0.6
  max_tokens: 2048
  reasoner:
    provider: "groq" # or openrouter or nvidia
    api_key: "your_groq_api_key"
    model: "deepseek-r1-distill-llama-70b" # or "deepseek/deepseek-r1:free" or "deepseek/deepseek-r1"
    temperature: 0.5
    max_tokens: 1024
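
Loading a reasoner-augmented config works exactly like any other config file. A minimal sketch, assuming the YAML above is saved at the placeholder path ./config/config_reasoner.yml:

from aicore.config import Config
from aicore.llm import Llm

# Placeholder path — point this at wherever you saved the reasoner config
config = Config.from_yaml("./config/config_reasoner.yml")

llm = Llm.from_config(config.llm)
response = llm.complete("Plan the steps needed to refactor this module")
print(response)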

Reasoner4All

A Hugging Face Space showcasing reasoning-augmented models
Hugging Face Space

⏮ GitRecap

Instant summaries of Git activity
🌐 Live App
📦 GitHub Repository

🌀 CodeTide & AgentTide Integration

📦 GitHub Repository

CodeTide is a fully local, privacy-first tool for parsing and understanding Python codebases using symbolic, structural analysis—no LLMs, no embeddings, just fast and deterministic code intelligence. It enables developers and AI agents to retrieve precise code context, visualize project structure, and generate atomic code changes with confidence.

AgentTide is a next-generation, precision-driven software engineering agent built on top of CodeTide. AgentTide leverages CodeTide’s symbolic code understanding to plan, generate, and apply high-quality code patches—always with full context and requirements fidelity. You can interact with AgentTide via a conversational CLI or a beautiful web UI.

Live Demo: Try AgentTide on Hugging Face Spaces: https://mclovinittt-agenttidedemo.hf.space/

AiCore was used to make LLM calls within AgentTide, enabling seamless integration between local code analysis and advanced language models. This combination empowers AgentTide to deliver context-aware, production-ready code changes—always under your control.

Future Plans

  • Extended Provider Support: Additional LLM and embedding providers
  • Speech support: integrate text-to-speech and speech-to-text objects with usage tracking and observability

Documentation

For complete documentation, including API references, advanced usage examples, and configuration guides, visit:

📖 Official Documentation Site

License

This project is licensed under the Apache 2.0 License.
