alauda · davidwtf · Feb 26, 2026
diff --git a/docs/en/llama_stack/index.mdx b/docs/en/llama_stack/index.mdx
@@ -0,0 +1,6 @@
+---
+weight: 82
+---
+# Llama Stack
+
+<Overview />
diff --git a/docs/en/llama_stack/install.mdx b/docs/en/llama_stack/install.mdx
@@ -0,0 +1,76 @@
+---
+weight: 20
+---
+
+# Install Llama Stack
+
+This document describes how to install and deploy Llama Stack Server on Kubernetes using the Llama Stack Operator.
+
+## Upload Operator
+
+Download the Llama Stack Operator installation file (e.g., `llama-stack-operator.alpha.ALL.v0.7.0.tgz`).
+
+Use the `violet` command-line tool to publish the package to the platform repository:
+
+```bash
+violet push --platform-address=<platform-access-address> --platform-username=<platform-admin> --platform-password=<platform-admin-password> llama-stack-operator.alpha.ALL.v0.7.0.tgz
+```
+
+## Install Operator
+
+1. Go to the `Administrator` view in the Alauda Container Platform.
+
+2. In the left navigation, select `Marketplace` / `Operator Hub`.
+
+3. In the right panel, find `Alauda build of Llama Stack` and click `Install`.
+
+4. Keep all parameters as default and complete the installation.
+
+## Deploy Llama Stack Server
+
+After the operator is installed, deploy Llama Stack Server by creating a `LlamaStackDistribution` custom resource:
+
+> **Note:** Prepare the following in advance; otherwise the distribution may not become ready:
+> - **Secret**: Create a Secret (e.g., `deepseek-api`) in the same namespace with the LLM API token. Example: `kubectl create secret generic deepseek-api -n default --from-literal=token=<LLM_API_KEY>`.
+> - **Storage Class**: Ensure the `default` Storage Class exists in the cluster; otherwise the PVC cannot be bound and the resource will not become ready.
+
+```yaml
+apiVersion: llamastack.io/v1alpha1
+kind: LlamaStackDistribution
+metadata:
+  annotations:
+    cpaas.io/display-name: ""
+  name: demo
+  namespace: default
+spec:
+  network:
+    exposeRoute: false                             # Whether to expose the route externally
+  replicas: 1                                      # Number of server replicas
+  server:
+    containerSpec:
+      env:
+        - name: VLLM_URL
+          value: "https://api.deepseek.com/v1"     # URL of the LLM API provider
+        - name: VLLM_MAX_TOKENS
+          value: "8192"                            # Maximum output tokens
+        - name: VLLM_API_TOKEN                     # Load LLM API token from secret
+          valueFrom:
+            secretKeyRef:                          # The Secret must already exist in this namespace (see the note above)
+              key: token
+              name: deepseek-api
+      name: llama-stack
+      port: 8321
+    distribution:
+      name: starter                                # Distribution name (options: starter, postgres-demo, meta-reference-gpu)
+    storage:
+      mountPath: /home/lls/.lls
+      size: 20Gi                                   # Requires the "default" Storage Class to be configured beforehand
+```
+
+After deployment, the Llama Stack Server will be available within the cluster. The access URL is displayed in `status.serviceURL`, for example:
+
+```yaml
+status:
+  phase: Ready
+  serviceURL: http://demo-service.default.svc.cluster.local:8321
+```
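
To read the URL programmatically, the `status` block above can be parsed from the resource's JSON. The following is a minimal Python sketch, not part of the operator: the helper name `ready_service_url` is hypothetical, and it assumes the JSON was obtained with something like `kubectl get llamastackdistribution demo -n default -o json`.

```python
import json
from typing import Optional


def ready_service_url(resource: dict) -> Optional[str]:
    """Return status.serviceURL if the distribution reports phase Ready, else None."""
    status = resource.get("status") or {}
    if status.get("phase") != "Ready":
        return None
    return status.get("serviceURL")


# Example input mirroring the status shown above (trimmed to the relevant fields).
example = json.loads("""
{
  "kind": "LlamaStackDistribution",
  "metadata": {"name": "demo", "namespace": "default"},
  "status": {
    "phase": "Ready",
    "serviceURL": "http://demo-service.default.svc.cluster.local:8321"
  }
}
""")

print(ready_service_url(example))
```

Returning `None` until the phase is `Ready` lets a caller poll the resource and start connecting only once the server is up.
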
diff --git a/docs/en/llama_stack/overview/features.mdx b/docs/en/llama_stack/overview/features.mdx
@@ -0,0 +1,29 @@
+---
+weight: 20
+---
+
+# Main Features
+
+## Server-Based Architecture
+
+- **Centralized Server**: Llama Stack Server hosts inference, agents, safety, tool runtime, vector I/O, and files
+- **Remote or Inline Providers**: Support for remote APIs (e.g., OpenAI-compatible) and inline providers (e.g., meta-reference, sqlite-vec, localfs)
+- **Kubernetes Deployment**: Deploy via Llama Stack Operator using `LlamaStackDistribution` custom resources
+
+## AI Agents with Tools
+
+- **Agent Creation**: Create agents with model, instructions, and a list of tools
+- **Client-Side Tools**: Define tools with the `@client_tool` decorator; the client executes tool calls and returns results to the server
+- **Session Management**: Create sessions and run multi-turn conversations with streaming responses
+- **Streaming**: Support for streaming agent responses for real-time display
+
+## Configuration and Extensibility
+
+- **Stack Configuration**: YAML-based configuration for APIs, providers, persistence (e.g., kv_default, sql_default), and models
+- **Environment Fallbacks**: Use `${env.VAR:=default}` in the configuration to fall back to a default value when an environment variable is unset
+- **Multiple Distributions**: Starter, postgres-demo, meta-reference-gpu and other distribution options
+
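
The environment fallback idea can be illustrated with a small resolver. This is a sketch of the concept only, not Llama Stack's implementation: the function name `resolve_env` and the sample default URL are hypothetical, it handles only the `${env.VAR}` and `${env.VAR:=default}` forms, and treating an empty variable like an unset one is an assumption borrowed from shell `:=` semantics.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${env.NAME} and ${env.NAME:=default}.
_ENV_REF = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*)(?::=([^}]*))?\}")


def resolve_env(text: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Replace ${env.NAME:=default} references in text with values from environ."""
    env = os.environ if environ is None else environ

    def _substitute(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        if value:                    # set and non-empty: use the variable
            return value
        if default is not None:      # unset or empty: fall back to the default
            return default
        raise KeyError(f"environment variable {name} is not set and has no default")

    return _ENV_REF.sub(_substitute, text)


# Hypothetical config line; the default URL is for illustration only.
line = "url: ${env.VLLM_URL:=http://localhost:8000/v1}"
print(resolve_env(line, {}))
print(resolve_env(line, {"VLLM_URL": "https://api.deepseek.com/v1"}))
```

With this pattern, one configuration file can serve several environments: the `env` entries in the `LlamaStackDistribution` resource override the defaults, and anything left unset keeps its fallback.
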
+## Integration
+
+- **Python Client**: `llama-stack-client` for Python 3.12+ with full agent and model APIs
+- **REST-Friendly**: Server exposes APIs for inference, agents, and tool runtime; can be wrapped in FastAPI or other web frameworks for production use
diff --git a/docs/en/llama_stack/overview/index.mdx b/docs/en/llama_stack/overview/index.mdx
@@ -0,0 +1,7 @@
+---
+weight: 10
+---
+
+# Overview
+
+<Overview />
diff --git a/docs/en/llama_stack/overview/intro.mdx b/docs/en/llama_stack/overview/intro.mdx
@@ -0,0 +1,28 @@
+---
+weight: 10
+---
+# Introduction
+
+## Llama Stack
+
+*Llama Stack* is a framework for building and running AI agents with tools. It provides a server-based architecture that enables developers to create agents that can interact with users, access external tools, and perform complex reasoning tasks.
+
+Main components and concepts include:
+
+- **Llama Stack Server**: Central service that hosts models, agents, and tool runtime. It can be deployed on Kubernetes via the Llama Stack Operator (see [Install Llama Stack](/en/llama_stack/install)).
+- **Client SDK** (`llama-stack-client`): Python client for connecting to the server, creating agents, defining tools with the `@client_tool` decorator, and managing sessions.
+- **Agents**: Configurable AI agents that use LLMs and can call tools (e.g., weather API, custom APIs) to answer user queries.
+- **Tools**: Functions exposed to the agent (e.g., weather query). Defined with `@client_tool` and passed to the agent at creation time.
+- **Configuration**: YAML stack configuration defines providers (inference, agents, safety, vector_io, files), persistence backends, and model registration (e.g., DeepSeek via OpenAI-compatible API).
+
+Llama Stack supports multiple API providers, storage and persistence backends, and distribution options (e.g., starter, postgres-demo, meta-reference-gpu), making it suitable for quick experiments and production deployments.
+
+## Documentation
+
+Llama Stack provides official documentation and resources for in-depth usage:
+
+### Official Documentation
+- **Main Documentation**: [https://llamastack.github.io/docs](https://llamastack.github.io/docs)
+  - Usage, API providers, and core concepts
+- **Core Concepts**: [https://llamastack.github.io/docs/concepts](https://llamastack.github.io/docs/concepts)
+  - Architecture, API stability, and resource management
diff --git a/docs/en/llama_stack/quickstart.mdx b/docs/en/llama_stack/quickstart.mdx
@@ -0,0 +1,78 @@
+---
+weight: 30
+---
+
+# Quickstart
+
+This section provides a quickstart example for creating an AI Agent with Llama Stack.
+
+## Prerequisites
+
+- Python 3.12 or higher (if your environment does not provide it, see [FAQ: How to prepare Python 3.12 in Notebook](#how-to-prepare-python-312-in-notebook))
+- Llama Stack Server installed and running via Operator (see [Install Llama Stack](./install))
+- Access to a Notebook environment (e.g., Jupyter Notebook, JupyterLab)
+- Python environment with `llama-stack-client` and required dependencies installed
+- API key for the LLM provider (e.g., DeepSeek API key)
+
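
Before opening the notebook, it can help to confirm that the server is reachable from the Notebook environment. The sketch below uses only the Python standard library. The helper names are hypothetical, and the `/v1/models` path is an assumption based on the upstream Llama Stack REST API; adjust it if your server version exposes a different route.

```python
import json
import urllib.request
from typing import Any


def models_url(base_url: str) -> str:
    """Build the model-listing URL from the server's base URL."""
    return base_url.rstrip("/") + "/v1/models"


def fetch_models(base_url: str, timeout: float = 5.0) -> Any:
    """Fetch and decode the model list; raises an error if the server is unreachable."""
    with urllib.request.urlopen(models_url(base_url), timeout=timeout) as response:
        return json.load(response)


# Only builds the URL; no network call is made here.
print(models_url("http://demo-service.default.svc.cluster.local:8321/"))
```

Calling `fetch_models` with the `status.serviceURL` value of your distribution should return a JSON document when the server is ready. A connection error usually points to a wrong URL or to a distribution that has not reached the `Ready` phase.
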
+## Quickstart Example
+
+A simple example of creating an AI Agent with Llama Stack is available in the following resources:
+
+- **Notebook**: [Llama Stack Quick Start Demo](/llama-stack/llama-stack_quickstart.ipynb)
+
+Download the notebook and upload it to a Notebook environment to run.
+
+The notebook demonstrates:
+
+- Client setup
+- Tool definition using the `@client_tool` decorator (weather query tool example)
+- Client connection to Llama Stack Server
+- Model selection and Agent creation with tools and instructions
+- Agent execution with session management and streaming responses
+- Result handling and display
+- Optional FastAPI deployment example
+
+## FAQ
+
+### How to prepare Python 3.12 in Notebook
+
+1. Download the pre-compiled Python installation package:
+
+   ```bash
+   wget -O /tmp/python312.tar.gz https://github.com/astral-sh/python-build-standalone/releases/download/20260114/cpython-3.12.12+20260114-x86_64-unknown-linux-gnu-install_only.tar.gz
+   ```
+
+2. Extract the archive:
+
+   ```bash
+   mkdir -p ~/python312
+   tar -xzf /tmp/python312.tar.gz -C ~/python312 --strip-components=1
+   ```
+
+3. Install `ipykernel` and register the kernel:
+
+   ```bash
+   export PATH="${HOME}/python312/bin:${PATH}"
+
+   python3 -m pip install ipykernel
+   python3 -m ipykernel install --user --name python312 --display-name "Python 3.12"
+   ```
+
+4. Switch kernel in the notebook page:
+
+   - Open your Notebook environment (e.g., Jupyter Notebook or JupyterLab) in the browser, then open an existing notebook or create a new one.
+   - In the notebook interface, find the current kernel name (usually shown in the **top-right corner** of the page, e.g., "Python 3" or "python3").
+   - Click that kernel name, or use the menu **Kernel → Change Kernel**.
+   - In the kernel list, select **"Python 3.12"** (the display name registered in step 3).
+   - After switching, new cells will run with Python 3.12.
+
+**Note**: `python` and `pip` commands executed directly in the notebook page still use the default Python, even after the kernel is switched. To use Python 3.12 for these commands, call the binaries by their full path, for example `~/python312/bin/python3 -m pip install <package>`.
+
+## Additional Resources
+
+For more resources on developing AI Agents with Llama Stack, see:
+
+- [Llama Stack Documentation](https://llamastack.github.io/docs) - The official Llama Stack documentation covering all usage-related topics, API providers, and core concepts.
+- [Llama Stack Core Concepts](https://llamastack.github.io/docs/concepts) - Deep dive into Llama Stack architecture, API stability, and resource management.
+- [Llama Stack GitHub Repository](https://github.com/llamastack/llama-stack) - Source code, example applications, distribution configurations, and how to add new API providers.
+- [Llama Stack Example Apps](https://github.com/llamastack/llama-stack-apps/) - Official examples demonstrating how to use Llama Stack in various scenarios.