Merged
79 commits
13e9709
docs: add LLM Service introduction
ai-bankofai Mar 15, 2026
3c38154
docs: add Quick Start for LLM Service
ai-bankofai Mar 15, 2026
d801547
Create pricing-and-usage.md
ai-bankofai Mar 15, 2026
ef88757
Create chatgpt-5-2.md
ai-bankofai Mar 15, 2026
7d5c48c
Create chatgpt-5-mini.md
ai-bankofai Mar 15, 2026
0589a75
Create chatgpt-5-nano.md
ai-bankofai Mar 15, 2026
0599568
Create claude-opus-4-6.md
ai-bankofai Mar 15, 2026
19d6d35
Create claude-opus-4-5.md
ai-bankofai Mar 15, 2026
10c9387
Create claude-sonnet-4-6.md
ai-bankofai Mar 15, 2026
bcee033
Create claude-sonnet-4-5.md
ai-bankofai Mar 15, 2026
b628118
Create claude-haiku-4-5.md
ai-bankofai Mar 15, 2026
5c05d99
Create gemini-3-1-pro.md
ai-bankofai Mar 15, 2026
ca45afc
Create gemini-3-flash.md
ai-bankofai Mar 15, 2026
fedd20c
Create chat-completion.md
ai-bankofai Mar 15, 2026
2c2caa7
Create integration-guide.md
ai-bankofai Mar 15, 2026
ef2d68b
Update integration-guide.md
ai-bankofai Mar 15, 2026
b46cff4
Update integration-guide.md
ai-bankofai Mar 15, 2026
b26284f
Create ne-click-script-tutorial.md
ai-bankofai Mar 15, 2026
d34d935
Rename ne-click-script-tutorial.md to one-click-script-tutorial.md
ai-bankofai Mar 15, 2026
3db8666
Update one-click-script-tutorial.md
ai-bankofai Mar 15, 2026
a9b0079
Add files via upload
ai-bankofai Mar 16, 2026
c2b0274
Update sidebars.js
ai-bankofai Mar 16, 2026
b711bc9
Add files via upload
ai-bankofai Mar 16, 2026
3f3bd9e
Add files via upload
ai-bankofai Mar 16, 2026
5409581
Add files via upload
ai-bankofai Mar 16, 2026
f76f57e
Delete docs/llm-service/api/ai_studio_code.md
ai-bankofai Mar 16, 2026
787c9df
Delete docs/llm-service/api/ai_studio_code.yaml
ai-bankofai Mar 16, 2026
0511de1
Update swagger.json
ai-bankofai Mar 16, 2026
40de1f6
Delete docs/llm-service/api/swagger.json
ai-bankofai Mar 16, 2026
7268078
Delete docs/llm-service/api/Bankofai API.md
ai-bankofai Mar 16, 2026
07e3799
Add files via upload
ai-bankofai Mar 16, 2026
da0d9c2
Add files via upload
ai-bankofai Mar 16, 2026
ad59fc7
Update API.md
ai-bankofai Mar 16, 2026
5ee977a
Delete docs/llm-service/api/Bankofai API.md
ai-bankofai Mar 16, 2026
b796721
Delete docs/llm-service/api/chat-completion.md
ai-bankofai Mar 16, 2026
a0643c5
Update one-click-script-tutorial.md
ai-bankofai Mar 16, 2026
6f38a51
Update one-click-script-tutorial.md
ai-bankofai Mar 16, 2026
6287159
Update one-click-script-tutorial.md
ai-bankofai Mar 16, 2026
ee522cf
Create glm-5.md
ai-bankofai Mar 16, 2026
6862b97
Create kimi-k2.5.md
ai-bankofai Mar 16, 2026
6a6b9c9
Create minimax-m2.5.md
ai-bankofai Mar 16, 2026
4b1d7b3
Update integration-guide.md
ai-bankofai Mar 17, 2026
7282f61
Update integration-guide.md
ai-bankofai Mar 17, 2026
ec0cb42
Update integration-guide.md
ai-bankofai Mar 17, 2026
8cc3f88
Update integration-guide.md
ai-bankofai Mar 17, 2026
da9e087
Update integration-guide.md
ai-bankofai Mar 17, 2026
b720194
Update integration-guide.md
ai-bankofai Mar 17, 2026
61fd4fb
Update integration-guide.md
ai-bankofai Mar 17, 2026
42981f2
Update integration-guide.md
ai-bankofai Mar 17, 2026
c9a5647
Create llm-service
ai-bankofai Mar 17, 2026
6db77e9
Create introduction.md
ai-bankofai Mar 17, 2026
2f89493
Delete i18n/zh-Hans/docusaurus-plugin-content-docs/current/llm-service
ai-bankofai Mar 17, 2026
4311b5a
Create introduction.md
ai-bankofai Mar 17, 2026
273c72f
Delete i18n/zh-Hans/docusaurus-plugin-content-docs/llm-service directory
ai-bankofai Mar 17, 2026
82c5e05
Create quick-start.md
ai-bankofai Mar 17, 2026
28ea31b
Update quick-start.md
ai-bankofai Mar 17, 2026
c31773a
Create pricing-and-usage.md
ai-bankofai Mar 17, 2026
f2c52dc
Create chatgpt-5-2.md
ai-bankofai Mar 17, 2026
e30d121
Create chatgpt-5-mini.md
ai-bankofai Mar 17, 2026
8b9c1fe
Create chatgpt-5-nano.md
ai-bankofai Mar 17, 2026
b8bbe37
Update chatgpt-5-nano.md
ai-bankofai Mar 17, 2026
d8ef698
Create claude-haiku-4-5.md
ai-bankofai Mar 17, 2026
d931aa5
Create claude-opus-4-5.md
ai-bankofai Mar 17, 2026
3b538c9
Create claude-opus-4-6.md
ai-bankofai Mar 17, 2026
77f827d
Create claude-sonnet-4-5.md
ai-bankofai Mar 17, 2026
ceca1df
Create claude-sonnet-4-6.md
ai-bankofai Mar 17, 2026
4235cb2
Create gemini-3-1-pro.md
ai-bankofai Mar 17, 2026
c666fce
Create gemini-3-flash.md
ai-bankofai Mar 17, 2026
70e6277
Create glm-5.md
ai-bankofai Mar 17, 2026
ef5dd5f
Create kimi-k2.5.md
ai-bankofai Mar 17, 2026
c3035af
Create minimax-m2.5.md
ai-bankofai Mar 17, 2026
76342f2
Create integration-guide.md
ai-bankofai Mar 17, 2026
7a97ce6
Create one-click-script-tutorial.md
ai-bankofai Mar 17, 2026
a8439ce
Update one-click-script-tutorial.md
ai-bankofai Mar 17, 2026
47a6647
Create API.md
ai-bankofai Mar 17, 2026
347d776
Update API.md
ai-bankofai Mar 17, 2026
9823854
config sidebars
jizhen181-dot Mar 18, 2026
860e0a7
fix llm-service-->llmservice
jizhen181-dot Mar 18, 2026
304c34a
Merge branch 'main' into ai-bankofai-patch-1
Will-Guan Mar 18, 2026
173 changes: 173 additions & 0 deletions docs/llmservice/api/API.md
@@ -0,0 +1,173 @@
# BANK OF AI LLM API (OpenAI Compatible)
OpenAPI spec for /v1/models and /v1/chat/completions (OpenAI format).

## Version: 1.0

**Schemes:** https

**Host:** api.bankofai.io

---
### /v1/chat/completions

#### POST
##### Summary

Create chat completion (OpenAI compatible)

##### Description

Creates a chat completion. Authentication: Bearer token. Non-streaming requests return a JSON body with `choices[].content`; streaming requests return SSE chunks with `choices[].delta.content`.

##### Parameters

| Name | Located in | Description | Required | Schema |
| ---- | ---------- | ----------- | -------- | ------ |
| Authorization | header | `Bearer <token>`, e.g. `Bearer sk-xxx` | Yes | string |
| body | body | Request body (model, messages required; stream, max_tokens, temperature, top_p, stop, n optional) | Yes | [main.ChatCompletionsRequest](#mainchatcompletionsrequest) |

##### Responses

| Code | Description | Schema |
| ---- | ----------- | ------ |
| 200 | Non-stream: choices[].content. Stream (SSE): each chunk is ChatCompletionsStreamChunk with choices[].delta.content. | [main.ChatCompletionsResponse](#mainchatcompletionsresponse) |
| 401 | Authentication failed | object |

---
### /v1/models

#### GET
##### Summary

List models (OpenAI compatible)

##### Description

Lists the available models. Authentication: Bearer token. The response body contains `object`, `success`, and `data` fields.

##### Parameters

| Name | Located in | Description | Required | Schema |
| ---- | ---------- | ----------- | -------- | ------ |
| Authorization | header | `Bearer <token>`, e.g. `Bearer sk-xxx` | Yes | string |

##### Responses

| Code | Description | Schema |
| ---- | ----------- | ------ |
| 200 | object: list; success: true; data: array of { id, object, created, owned_by } | [main.V1ModelsResponse](#mainv1modelsresponse) |
| 401 | Authentication failed | object |
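
A small sketch of consuming the documented `/v1/models` response shape; the sample payload below is illustrative, using the example values from the schema tables:

```python
def list_model_ids(models_response):
    """Return the model ids from a /v1/models response.

    Expects the shape documented above:
    { "object": "list", "success": true, "data": [ { "id", ... }, ... ] }
    """
    if not models_response.get("success", False):
        raise RuntimeError("model listing failed")
    return [item["id"] for item in models_response["data"]]


sample = {
    "object": "list",
    "success": True,
    "data": [
        {"id": "gpt-4", "object": "model", "created": 1626777600, "owned_by": "openai"},
    ],
}
print(list_model_ids(sample))  # ['gpt-4']
```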

---
### Models

#### main.ChatChoice

| Name | Type | Description | Required |
| ---- | ---- | ----------- | -------- |
| content | string | | No |
| finish_reason | string | *Example:* `"stop"` | No |
| index | integer | | No |

#### main.ChatCompletionsRequest

| Name | Type | Description | Required |
| ---- | ---- | ----------- | -------- |
| frequency_penalty | number | FrequencyPenalty: -2.0 to 2.0. Penalize repeated tokens. Default 0. | No |
| max_tokens | integer | MaxTokens: maximum number of tokens that can be generated in the completion. | No |
| messages | [ [main.ChatMessage](#mainchatmessage) ] | Messages: list of messages in the conversation. Required. | Yes |
| model | string | Model: ID of the model to use (e.g. gpt-4). Required.<br/>*Example:* `"gpt-4"` | Yes |
| n | integer | N: how many chat completion choices to generate. Default 1. | No |
| presence_penalty | number | PresencePenalty: -2.0 to 2.0. Penalize tokens that appear in the text so far. Default 0. | No |
| response_format | [main.ChatResponseFormat](#mainchatresponseformat) | ResponseFormat: specify output format: { "type": "text" } or { "type": "json_object" } or json_schema. | No |
| seed | integer | Seed: random seed for deterministic sampling (if supported by model). | No |
| stop | | Stop: up to 4 sequences where the API will stop generating. String or array of strings. | No |
| stream | boolean | Stream: if true, partial message deltas will be sent as server-sent events. Default false. | No |
| temperature | number | Temperature: sampling temperature between 0 and 2. Higher = more random. Default 1. | No |
| tool_choice | | ToolChoice: "none" \| "auto" \| { "type": "function", "function": { "name": "..." } }. Controls which tool(s) to call. | No |
| tools | [ [main.ChatTool](#mainchattool) ] | Tools: list of tools the model may call. Each has type "function" and function { name, description?, parameters? }. | No |
| top_p | number | TopP: nucleus sampling: consider tokens with top_p probability mass. Default 1. | No |
| user | string | User: optional end-user identifier for abuse monitoring. | No |
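
The table above can be illustrated with a concrete request body exercising the optional sampling and tool fields. The `get_weather` tool is hypothetical, chosen only to show the `tools`/`tool_choice` shape:

```python
# Example /v1/chat/completions request body (all fields from the table above).
request_body = {
    "model": "gpt-4",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "What's the weather in Paris?"},
    ],
    "temperature": 0.7,   # 0 to 2, default 1
    "top_p": 1,           # nucleus sampling probability mass
    "n": 1,               # number of choices to generate
    "stream": False,      # set True for SSE deltas
    "stop": ["\n\n"],     # up to 4 stop sequences
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool for illustration
                "description": "Look up the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",  # or "none", or a specific {"type": "function", ...}
}
```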

#### main.ChatCompletionsResponse

| Name | Type | Description | Required |
| ---- | ---- | ----------- | -------- |
| choices | [ [main.ChatChoice](#mainchatchoice) ] | | No |
| created | integer | *Example:* `1677652288` | No |
| id | string | *Example:* `"chatcmpl-xxx"` | No |
| model | string | *Example:* `"gpt-4"` | No |
| object | string | *Example:* `"chat.completion"` | No |
| usage | [main.ChatUsage](#mainchatusage) | | No |

#### main.ChatMessage

| Name | Type | Description | Required |
| ---- | ---- | ----------- | -------- |
| content | string | Content: message content. For tool role, the result of the tool call.<br/>*Example:* `"Hello"` | No |
| name | string | Name: optional name for the message author (e.g. to disambiguate multiple users). | No |
| role | string | Role: "system" \| "user" \| "assistant" \| "tool". System sets behavior; user/assistant are conversation; tool is tool result.<br/>*Example:* `"user"` | No |
| tool_call_id | string | ToolCallId: when role is "tool", the id of the tool call this result is for. Required for tool messages. | No |
| tool_calls | [ [main.ChatToolCallItem](#mainchattoolcallitem) ] | ToolCalls: when role is "assistant" and the model called tools, array of { id, type, function: { name, arguments } }. | No |

#### main.ChatResponseFormat

| Name | Type | Description | Required |
| ---- | ---- | ----------- | -------- |
| json_schema | | JsonSchema: when type is json_schema, optional schema for the output. | No |
| type | string | Type: "text", "json_object", or "json_schema". | No |

#### main.ChatTool

| Name | Type | Description | Required |
| ---- | ---- | ----------- | -------- |
| function | [main.ChatToolFunction](#mainchattoolfunction) | Function: function definition (name, description, parameters). | No |
| type | string | Type: must be "function".<br/>*Example:* `"function"` | No |

#### main.ChatToolCallFunction

| Name | Type | Description | Required |
| ---- | ---- | ----------- | -------- |
| arguments | string | Arguments: JSON string of the arguments. | No |
| name | string | Name: name of the function to call. | No |

#### main.ChatToolCallItem

| Name | Type | Description | Required |
| ---- | ---- | ----------- | -------- |
| function | [main.ChatToolCallFunction](#mainchattoolcallfunction) | Function: name and arguments of the call. | No |
| id | string | Id: ID of the tool call. | No |
| type | string | Type: "function".<br/>*Example:* `"function"` | No |
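
The round trip between an assistant `tool_calls` item and the `role: "tool"` reply described in these tables can be sketched as follows; the `call_123` id and `get_weather` function are hypothetical:

```python
import json


def tool_result_message(tool_call, result):
    """Build the role:"tool" message answering one tool call.

    `tool_call` is an item from the assistant message's `tool_calls`
    array: { id, type, function: { name, arguments } }.
    """
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],  # required for tool messages
        "content": json.dumps(result),    # tool result goes in content
    }


# An assistant message's tool_calls item, as documented above:
assistant_call = {
    "id": "call_123",
    "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
}

# `arguments` is a JSON *string*, so parse it before executing the tool:
args = json.loads(assistant_call["function"]["arguments"])
reply = tool_result_message(assistant_call, {"city": args["city"], "temp_c": 18})
print(reply["tool_call_id"])  # call_123
```

Appending `reply` to `messages` and re-sending the request lets the model incorporate the tool result into its next answer.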

#### main.ChatToolFunction

| Name | Type | Description | Required |
| ---- | ---- | ----------- | -------- |
| description | string | Description: optional description for the model. | No |
| name | string | Name: name of the function. | No |
| parameters | | Parameters: optional JSON schema for the function arguments. | No |

#### main.ChatUsage

| Name | Type | Description | Required |
| ---- | ---- | ----------- | -------- |
| completion_tokens | integer | CompletionTokens: number of tokens in the completion. | No |
| prompt_tokens | integer | PromptTokens: number of tokens in the prompt. | No |
| total_tokens | integer | TotalTokens: total tokens (prompt + completion). | No |

#### main.V1ModelItem

| Name | Type | Description | Required |
| ---- | ---- | ----------- | -------- |
| created | integer | *Example:* `1626777600` | No |
| id | string | *Example:* `"gpt-4"` | No |
| object | string | *Example:* `"model"` | No |
| owned_by | string | *Example:* `"openai"` | No |

#### main.V1ModelsResponse

| Name | Type | Description | Required |
| ---- | ---- | ----------- | -------- |
| data | [ [main.V1ModelItem](#mainv1modelitem) ] | | No |
| object | string | *Example:* `"list"` | No |
| success | boolean | *Example:* `true` | No |
19 changes: 19 additions & 0 deletions docs/llmservice/introduction.md
@@ -0,0 +1,19 @@
# Welcome to LLM Service

## About LLM Service

LLM Service is a professional AI service module within the Bank of AI ecosystem, built on top-tier blockchain infrastructure. It is dedicated to providing users with an efficient, user-friendly, and creative AI interaction experience. As a core AI service infrastructure of Bank of AI, this service leverages the decentralization, security, and high efficiency of blockchain technology to introduce a brand-new AI service model.

The service's core features include:

* **Multi-Model AI Chat:** We integrate various industry-leading Large Language Models (LLMs), allowing users to select the most suitable model based on their specific needs.
* **Powerful Integrated AI Services:** We offer comprehensive AI-related API services, enabling users to access and integrate them rapidly and easily within the Bank of AI framework.
* **Web3 Native Experience:** Through seamless integration with mainstream Web3 wallets, we provide an end-to-end native experience, from login to payment.

## Why Choose LLM Service?

Choosing our LLM Service means enjoying the unique advantages of a secure blockchain ecosystem alongside meticulously designed features.

* **Multi-chain Ecosystem Advantages:** As part of the Bank of AI ecosystem, users can make payments using mainstream tokens on supported chains, benefiting from fast transaction confirmations and low fees.
* **Low Cost & High Efficiency:** By optimizing resources and ensuring efficient on-chain interactions, we deliver highly cost-effective AI services to users.
* **Security & Privacy Protection:** We utilize a decentralized login method. Users can complete authentication simply by signing with their Web3 wallet, ensuring greater security and privacy for all AI interactions.
30 changes: 30 additions & 0 deletions docs/llmservice/models/chatgpt-5-2.md
@@ -0,0 +1,30 @@
# ChatGPT-5.2

## Overview
ChatGPT-5.2 is the latest generation of the flagship large language model developed by OpenAI. Building upon the powerful capabilities of the 5.1 version, it further optimizes the speed of multimodal processing and the execution efficiency of complex tasks, making it the ideal choice for professional users seeking ultimate performance and efficiency.

## Key Features
* **Efficient Multimodal Processing:** Significantly improves the parsing and generation speed of image and video content compared to 5.1, achieving a smoother multimodal interaction experience.
* **Enhanced Task Execution Efficiency:** Optimizes the internal reasoning engine, allowing for faster and more accurate conclusions when handling long-chain, multi-step complex tasks.
* **Stronger Interference Resistance:** Exhibits greater robustness and accuracy when processing inputs containing significant noise or ambiguous instructions.

## Best Use Cases
* **Real-time Data Analysis and Visualization:** Capable of quickly processing real-time data streams and generating complex charts and visualization reports.
* **Complex Project Management and Planning:** Assists with task decomposition, resource allocation, and risk assessment for efficient decision support.
* **High-Frequency, High-Precision Professional Consulting:** Suitable for professional fields requiring fast and accurate responses, such as financial trading analysis and legal document retrieval.

## Capabilities and Limitations

| Capability | Detailed Description |
| :--- | :--- |
| **Reasoning Ability** | Extremely Strong. Maintains a leading position in complex logical reasoning and scientific computation, with improved efficiency. |
| **Creative Ability** | Extremely Strong. Can generate high-quality, in-depth content, particularly excelling in structured and professional texts. |
| **Multimodal Ability** | Comprehensive and Efficient. Supports input and understanding of images, videos, and audio, and can quickly generate high-quality image content. |
| **Response Speed** | Medium to Slow. Improved compared to 5.1, but still a deep analysis model, not suitable for extremely low-latency scenarios. |
| **Context Window** | Huge. Supports a context window of millions of tokens. |

## Credits and Pricing

| Model | Input (Credits/Token) | Output (Credits/Token) |
| :--- | :--- | :--- |
| **ChatGPT-5.2** | 1.75 | 14.00 |
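
As a worked example of the table above (the token counts are hypothetical; the rates are the ChatGPT-5.2 figures):

```python
def cost_in_credits(prompt_tokens, completion_tokens,
                    input_rate=1.75, output_rate=14.00):
    """Credits consumed by one request at ChatGPT-5.2 rates."""
    return prompt_tokens * input_rate + completion_tokens * output_rate


# A request with 1,000 input tokens and 500 output tokens:
# 1000 * 1.75 + 500 * 14.00 = 1750 + 7000
print(cost_in_credits(1000, 500))  # 8750.0
```

The same formula applies to the other models, substituting their input and output rates.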
30 changes: 30 additions & 0 deletions docs/llmservice/models/chatgpt-5-mini.md
@@ -0,0 +1,30 @@
# ChatGPT-5-mini

## Overview
ChatGPT-5-mini is an efficient and economical lightweight language model. It is optimized for fast, smooth daily conversations and general tasks, making it a premier choice for cost-effective AI interaction within the Bank of AI ecosystem.

## Key Features
* **Extremely Fast Response:** Deeply optimized for low response latency, providing a near real-time conversation experience.
* **High Cost-Effectiveness:** While ensuring high-quality output, its computational cost is significantly lower than flagship models, achieving a balance between performance and cost.
* **Strong General Capabilities:** Covers a wide range of daily application scenarios, from quick Q&A to text organization, with stable and reliable performance.

## Best Use Cases
* **Daily Conversation and Quick Q&A:** Acts as a smart assistant for quickly answering factual questions and engaging in casual chat.
* **Text Processing:** Summarizing, polishing, formatting, and extracting keywords from emails, articles, and documents.
* **Initial Draft Generation:** Quickly generating drafts for social media posts, product descriptions, and blog articles.

## Capabilities and Limitations

| Capability | Detailed Description |
| :--- | :--- |
| **Reasoning Ability** | **Medium.** Can handle simple logical reasoning, but may falter on multi-step complex problems. |
| **Creative Ability** | **Medium.** Generates fluent and coherent text, but is relatively limited in depth and professional creativity. |
| **Multimodal Ability** | **Not Supported.** This model focuses on text processing and does not have image or audio understanding capabilities. |
| **Response Speed** | **Fast.** One of the fastest responding models on the platform. |
| **Context Window** | **Standard.** Supports tens of thousands of tokens, sufficient for most daily conversation scenarios. |

## Credits and Pricing

| Model | Input (Credits/Token) | Output (Credits/Token) |
| :--- | :--- | :--- |
| **ChatGPT-5-mini** | 0.25 | 2.00 |
30 changes: 30 additions & 0 deletions docs/llmservice/models/chatgpt-5-nano.md
@@ -0,0 +1,30 @@
# ChatGPT-5-nano

## Overview
ChatGPT-5-nano is an advanced language model that strikes an excellent balance between performance, speed, and cost. It is designed to provide near-professional AI capabilities at a moderate cost within the Bank of AI ecosystem.

## Key Features
* **Enhanced Reasoning Ability:** Nano shows significant improvements in logical reasoning, code generation, and multilingual processing compared to lighter models.
* **Efficient Performance:** Carefully tuned to maintain high output quality while sustaining a fast response speed.
* **Multifunctional Integration:** Capable of handling a diverse range of tasks, making it a powerful assistant for developers and content creators.

## Best Use Cases
* **Code Assistance and Debugging:** Understanding and generating code across multiple programming languages, assisting with debugging and documentation.
* **Multilingual Translation and Writing:** Providing high-quality cross-language translation and creating content in authentic language styles.
* **Structured Content Generation:** Generating well-formatted reports, technical documents, tutorials, and other structured content.

## Capabilities and Limitations

| Capability | Detailed Description |
| :--- | :--- |
| **Reasoning Ability** | **Strong.** Can handle complex logical problems and programming tasks, performing well in specific domains. |
| **Creative Ability** | **Strong.** Generates creative and in-depth text content, meeting high writing requirements. |
| **Multimodal Ability** | **Limited Support.** Can understand and describe simple image content, but does not support deep multimodal analysis. |
| **Response Speed** | **Medium.** Faster than flagship models, though slightly slower than the mini model. |
| **Context Window** | **Large.** Supports a context window of hundreds of thousands of tokens for long document processing. |

## Credits and Pricing

| Model | Input (Credits/Token) | Output (Credits/Token) |
| :--- | :--- | :--- |
| **ChatGPT-5-nano** | 0.05 | 0.40 |
30 changes: 30 additions & 0 deletions docs/llmservice/models/claude-haiku-4-5.md
@@ -0,0 +1,30 @@
# Claude Haiku 4.5

## Overview
Claude Haiku 4.5 is the fastest and most compact AI model developed by Anthropic, integrated into the Bank of AI platform. It is designed to provide near-instantaneous responses, making it the premier choice for building seamless, real-time AI experiences.

## Key Features
* **Unmatched Response Speed:** As the fastest model in its intelligence class, it delivers extremely low latency, ideal for applications requiring immediate interaction.
* **Ultimate Cost-Effectiveness:** Highly competitive pricing makes it the most viable option for deploying AI at scale across massive user scenarios.
* **Enterprise-Grade Robustness:** Has undergone rigorous security testing to ensure the reliability and safety required for professional enterprise applications.

## Best Use Cases
* **Real-time Chatbots & Moderation:** Providing smooth, natural conversation experiences and quickly moderating user-generated content.
* **Mobile AI Applications:** Optimized for mobile environments where latency and resource consumption are critical factors.
* **Workflow Streamlining:** Automating routine tasks such as email classification, meeting summarization, and form data extraction to improve daily efficiency.

## Capabilities and Limitations

| Capability | Detailed Description |
| :--- | :--- |
| **Reasoning Ability** | **Medium.** Capable of handling general tasks, but limited in solving highly complex or multi-step reasoning problems. |
| **Creative Ability** | **Medium.** Generates concise and fluent text, best suited for information delivery rather than deep creative writing. |
| **Multimodal Ability** | **Supported.** Basic image understanding capabilities to identify and describe objects within visual inputs. |
| **Response Speed** | **Extremely Fast.** The fastest responding model on the platform, enabling near-instantaneous interaction. |
| **Context Window** | **Huge.** Supports an extra-long context window, capable of handling large documents and extensive conversation histories. |

## Credits and Pricing

| Model | Input (Credits/Token) | Output (Credits/Token) |
| :--- | :--- | :--- |
| **Claude Haiku 4.5** | 1.00 | 5.00 |