Merged
70 changes: 49 additions & 21 deletions docs/en/overview/architecture.mdx
@@ -7,28 +7,56 @@ The diagram below illustrates the architecture of the Alauda AI platform.

![architecture](./assets/architecture.png)

NOTE: Alauda AI uses some general Kubernetes and ACP components, including:
## Component Description

* ALB
* Erebus
* kube-apiserver (kubernetes component)
### Components in Alauda Container Platform Layer

| Component | Description | Type | License |
| --- | --- | --- | --- |
| Lich | Alauda AI UI console | Self-developed | |
| aml-operator | Manages installation and life cycles of Alauda AI components | Self-developed | |
| aml-apiserver | Extends the Kubernetes API server and provides authorization enhancements for Alauda AI API access | Self-developed | |
| skipper & oauth2-proxy | Proxies traffic from the global cluster to workload clusters. Traffic is authenticated by oauth2-proxy | Open source | Apache Version 2.0 |
| aml-controller | Manages Alauda AI namespaces on workload clusters. Each namespace is automatically configured with a Model Repo space and corresponding resources. | Self-developed | |
| aml-api-deploy | Provides high-level APIs for "Lich" | Self-developed | |
| GitLab (with MinIO or S3) | Model repository backend storage and version tracking. | Open source | MIT |
| kserve-controller | (Optionally with knative serving enabled) Manages AI inference services and inference service runtimes. | Open source | Apache Version 2.0 |
| workspace-controller | Manages workbench instances (jupyter notebooks, codeserver) | Open source | Apache Version 2.0 |
| Volcano | Plugin to provide co-scheduling (gang-scheduling) features for AI training jobs. Also manages "volcanojob" resource to run general training workloads. | Open source | Apache Version 2.0 |
| MLFlow | Track training, evaluation jobs by storing, visualizing metrics and artifacts | Open source | Apache Version 2.0 |
| Fine Tuning | Experimental UI providing no-code LLM fine-tuning job creation and management | Self-developed | |
| Kubeflow | Open source plugin providing MLOps features including: Notebooks, Tensorboard, Kubeflow pipeline, training operator. | Open source | Apache Version 2.0 |
| Label Studio | Open source plugin for dataset labeling | Open source | Apache Version 2.0 |
| Dify | Open source plugin for creating LLM Agents and RAG applications using a web UI | Open source | A modified version of the Apache License 2.0 |
| Evidently | Open source plugin for monitoring online inference service performance and data drifts | Open source | Apache Version 2.0 |
| GPU device plugins | HAMi and the NVIDIA GPU device plugin | Open source | Apache Version 2.0 |
| GPU (Alauda Build of Nvidia GPU Device Plugin) | Provides GPU resources for AI workloads | Open source | Apache Version 2.0 |
| HAMi (Alauda Build of Hami, Alauda Build of Hami-WebUI) | GPU resource slicing, sharing and scheduling | Open source | Apache Version 2.0 |
| Alauda Build of DCGM-Exporter | GPU monitoring | Open source | Apache Version 2.0 |
| Alauda Build of NPU Operator | Provides NPU resources for AI workloads | Open source | Apache Version 2.0 |
| Alauda Build of Node Feature Discovery | Detects hardware features of cluster nodes | Open source | Apache Version 2.0 |
| DRA (Alauda build of NVIDIA DRA Driver for GPUs) | Dynamic Resource Allocation for GPU sharing | Open source | Apache Version 2.0 |
| Volcano (Alauda support for Volcano) | Batch job scheduling for AI workloads | Open source | Apache Version 2.0 |
| Kueue (Alauda Build of Kueue) | Job scheduling for AI workloads | Open source | Apache Version 2.0 |
| Milvus (Alauda Build of Milvus) | Vector database for embedding storage and retrieval | Open source | Apache Version 2.0 |
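To illustrate the gang-scheduling feature that Volcano contributes to this layer, the sketch below shows a hypothetical `volcanojob` resource: all replicas of the task must be schedulable before any of them start. The job name, container image, and resource values are assumptions for illustration, not taken from the platform.

```yaml
# Hypothetical VolcanoJob: with minAvailable equal to the replica count,
# Volcano schedules all pods together (gang scheduling) or none at all.
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: demo-training-job          # assumed name
spec:
  schedulerName: volcano           # hand the job to the Volcano scheduler
  minAvailable: 2                  # gang size: both worker pods, or nothing runs
  tasks:
    - name: worker
      replicas: 2
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: trainer
              image: example.com/trainer:latest   # assumed image
              resources:
                limits:
                  nvidia.com/gpu: 1               # one GPU per worker
```

This avoids the deadlock that default kube-scheduler behavior can cause for distributed training, where some workers start and hold GPUs while the rest stay pending.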
@zhaomingkun1030 (Contributor) commented on Feb 28, 2026:


Should this table list every component in full, or just the main ones? A few are still missing, e.g. PGVector (shown in the diagram), LWS (not in the diagram), etc.

| PGVector (Alauda support for PostgreSQL) | PostgreSQL extension for vector similarity search | Open source | The PostgreSQL License |


### Components in AI Platform Layer

| Component | Description | Type | License |
| --- | --- | --- | --- |
| Model Catalog (Alauda AI/Alauda AI Essentials) | Centralized repository for managing AI models and their metadata | Proprietary | Commercial |
| Model Registry (Alauda support for Kubeflow Model Registry) | Keep track of AI model versions and metadata for each namespace | Open source | Apache Version 2.0 |
| Datasets (Alauda AI/Alauda AI Essentials) | Centralized repository for managing datasets and their metadata | Proprietary | Commercial |
| Labeling (Alauda support for Label Studio) | Data labeling tool for creating labeled datasets | Open source | Apache Version 2.0 |
| Feature Store (Alauda support for FeatureForm) | Centralized repository for managing and serving machine learning features | Open source | Mozilla Public License (MPL) |
| Workbench (Alauda AI Workbench) | Web-based interface for managing AI projects, including model training and inference | Proprietary | Commercial |
| Training Jobs (Alauda support for Kubeflow Trainer v2) | Kubernetes-native training job management | Open source | Apache Version 2.0 |
| Kubeflow Pipelines (Alauda support for Kubeflow Base & Alauda support for Kubeflow Pipeline) | Workflow orchestration for AI pipelines | Open source | Apache Version 2.0 |
| Guardrails (Coming soon) | AI safety and governance framework | Open source | Apache Version 2.0 |
| Drift & Bias Detection (Alauda support for Evidently) | Monitoring for model performance degradation and bias | Open source | Apache Version 2.0 |
| Experiment Tracking (Alauda support for MLFlow) | Tracking and comparing machine learning experiments | Open source | Apache Version 2.0 |
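As a sketch of how the Workbench row above surfaces on the cluster, the fragment below shows a hypothetical Kubeflow `Notebook` resource of the kind that a workbench controller reconciles into a running Jupyter pod. The resource name, namespace, and image are assumptions for illustration.

```yaml
# Hypothetical Workbench instance expressed as a Kubeflow Notebook CR;
# the controller turns this into a StatefulSet/pod serving the notebook UI.
apiVersion: kubeflow.org/v1
kind: Notebook
metadata:
  name: demo-workbench
  namespace: my-ai-project         # assumed Alauda AI project namespace
spec:
  template:
    spec:
      containers:
        - name: demo-workbench
          image: jupyter/scipy-notebook:latest   # assumed notebook image
          resources:
            requests:
              cpu: "1"
              memory: 2Gi
```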


### Components in GenAI Platform Layer

| Component | Description | Type | License |
| --- | --- | --- | --- |
| Kserve (Alauda AI Model Serving/Alauda Generative AI) | Kubernetes-native model serving framework | Open source | Apache Version 2.0 |
| vLLM (Alauda AI Model Serving/Alauda Generative AI) | High-performance model inference engine for large language models | Open source | Apache Version 2.0 |
| llm-d (Alauda Generative AI) | Distributed inference engine for large language models | Open source | Apache Version 2.0 |
| Model as a Service (Alauda build of Envoy AI Gateway) | API gateway for serving AI models as a service | Open source | Apache Version 2.0 |
| Fine-tuning | Tools integrated with the workbench for fine-tuning large language models, e.g. transformers, accelerate, llama-factory etc. | Open source | - |
| Training (Alauda support for Kubeflow Trainer v2) | Kubernetes-native training job management | Open source | Apache Version 2.0 |

⚠️ Potential issue | 🟡 Minor

Potential duplicate of "Training Jobs" entry.

Line 54 "Training (Alauda support for Kubeflow Trainer v2)" appears to duplicate Line 38 "Training Jobs (Alauda support for Kubeflow Trainer v2)" with an identical description. Consider removing this duplicate or clarifying how they differ.

Suggested fix
-| Training (Alauda support for Kubeflow Trainer v2) | Kubernetes-native training job management | Open source | Apache Version 2.0 |

Or if they serve different purposes, update the descriptions to differentiate them.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
| Training (Alauda support for Kubeflow Trainer v2) | Kubernetes-native training job management | Open source | Apache Version 2.0 |

| Model Quantization | Tools integrated with the workbench for model quantization, e.g. llm-compressor etc. | Open source | - |
| Evaluation | Tools integrated with the workbench for evaluating model performance, e.g. lm-evaluation-harness etc. | Open source | - |
| Llama Stack (Alauda build of Llama Stack) | Framework for building applications with large language models | Open source | MIT |
| Langchain | Tools integrated with the workbench for building LLM applications using Langchain | Open source | MIT |
| Dify (Alauda support for Dify) | Platform for building AI assistants and chatbots | Open source | Apache Version 2.0 (modified) |

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

langgenius/dify official license Apache-2.0 modified terms

💡 Result:

langgenius/dify is not plain Apache-2.0; it uses the “Dify Open Source License”, described as a modified Apache License 2.0 with additional conditions. [1][2]

Apache-2.0 “modified terms” / additional conditions (from the official LICENSE)

  1. Multi-tenant (SaaS) restriction
    You may not use the Dify source code to operate a multi-tenant environment unless explicitly authorized in writing (tenant ≈ one workspace). [1]

  2. Logo / copyright removal restriction (frontend)
    If you use Dify’s frontend, you may not remove or modify the Dify LOGO or copyright information in the console/applications. This does not apply if your use does not involve Dify’s frontend (frontend defined as web/ or the Docker “web” image). [1]

  3. Contributor-specific terms
    Contributors agree that:

  • the producer can change the open-source agreement to be more strict or relaxed, and
  • contributed code may be used for commercial purposes (including cloud business operations). [1]

The license then states that other rights/restrictions follow Apache License 2.0. [1]

Sources: official Dify LICENSE text (raw) [1] and the repo license summary/label on GitHub [2].

[1] https://raw.githubusercontent.com/langgenius/dify/main/LICENSE
[2] https://github.com/langgenius/dify


🏁 Script executed:

# Check if there's a LICENSE or COPYING file in the repository
find . -type f -name "LICENSE*" -o -name "COPYING*" | head -20

Repository: alauda/aml-docs

Length of output: 41


🏁 Script executed:

# Search for license information in the repository root and common locations
if [ -f "LICENSE" ]; then
  echo "=== LICENSE file ===" 
  head -50 LICENSE
fi
if [ -f "LICENSE.md" ]; then
  echo "=== LICENSE.md file ===" 
  head -50 LICENSE.md
fi
if [ -f "LICENSE.txt" ]; then
  echo "=== LICENSE.txt file ===" 
  head -50 LICENSE.txt
fi

Repository: alauda/aml-docs

Length of output: 41


Update license expression to match Dify's official license name and include key restrictions.

The documentation currently shows Apache Version 2.0 (modified), which is ambiguous and incomplete. Dify uses the "Dify Open Source License" — an Apache-2.0-based license with specific additional restrictions that should be documented:

  • Multi-tenant (SaaS) restriction: Commercial multi-tenant deployments require explicit written authorization
  • Logo/copyright restriction (frontend): The Dify logo and copyright notice in the UI cannot be removed or modified when using Dify's frontend components

Clarify the license row to reference the actual license name and either link to the official LICENSE file or document these key terms to avoid misleading users about permitted use cases.


| MCP Servers | Can integrate with various MCP servers | - | - |
| Agent Tracing (Alauda support for MLflow) | Tracing and monitoring for AI agents | Open source | Apache Version 2.0 |
Comment on lines +42 to +61

⚠️ Potential issue | 🟡 Minor

Use consistent MLflow casing across the document.

Line 41 uses MLFlow while Line 60 uses MLflow. Please normalize to one form for consistency.


| Agent Evaluation | Tools integrated with the workbench for evaluating AI agents, e.g. RAGAS etc. | Open source | - |
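Tying the KServe and vLLM rows above together, the sketch below shows a hypothetical `InferenceService` of the kind reconciled by kserve-controller, serving an LLM through a vLLM-backed runtime. The service name, runtime name, and storage URI are assumptions for illustration, not taken from the platform.

```yaml
# Hypothetical InferenceService: KServe resolves the model format and
# runtime to a serving pod; here the runtime is assumed to wrap vLLM.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: demo-llm
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface                 # serve a Hugging Face-format model
      runtime: vllm                       # assumed ServingRuntime backed by vLLM
      storageUri: pvc://models/demo-llm   # assumed model location
      resources:
        limits:
          nvidia.com/gpu: 1
```

Once ready, the predictor exposes an HTTP endpoint (OpenAI-compatible when vLLM is the backend) that gateways such as Model as a Service can route to.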
Comment on lines +53 to +62

⚠️ Potential issue | 🟠 Major

Replace open-source license placeholders with explicit values.

On Lines 52-58 and Line 61, Type is Open source but License is -. In an architecture/licensing table this creates compliance ambiguity. Use explicit SPDX-style license values, or clearly mark Multiple/Varies (see component docs) with references.

Suggested table fix pattern
-| Fine-tuning | Tools integrated with the workbench for fine-tuning large language models, e.g. transformers, accelerate, llama-factory etc. | Open source | - |
+| Fine-tuning | Tools integrated with the workbench for fine-tuning large language models, e.g. transformers, accelerate, llama-factory etc. | Open source | Multiple/Varies (see component docs) |

-| Model Quantization | Tools integrated with the workbench for model quantization, e.g. llm-compressor etc. | Open source | - |
+| Model Quantization | Tools integrated with the workbench for model quantization, e.g. llm-compressor etc. | Open source | Multiple/Varies (see component docs) |
📝 Committable suggestion


Suggested change
| Fine-tuning | Tools integrated with the workbench for fine-tuning large language models, e.g. transformers, accelerate, llama-factory etc. | Open source | - |
| Training (Alauda support for Kubeflow Trainer v2) | Kubernetes-native training job management | Open source | Apache Version 2.0 |
| Model Quantization | Tools integrated with the workbench for model quantization, e.g. llm-compressor etc. | Open source | - |
| Evaluation | Tools integrated with the workbench for evaluating model performance, e.g. lm-evaluation-harness etc. | Open source | - |
| Llama Stack (Alauda build of Llama Stack) | Framework for building applications with large language models | Open source | - |
| Langchain | Tools integrated with the workbench for building LLM applications using Langchain | Open source | - |
| Dify (Alauda support for Dify) | Platform for building AI assistants and chatbots | Open source | - |
| MCP Servers | Can integrate with various MCP servers | - | - |
| Agent Tracing (Alauda support for MLflow) | Tracing and monitoring for AI agents | Open source | Apache Version 2.0 |
| Agent Evaluation | Tools integrated with the workbench for evaluating AI agents, e.g. RAGAS etc. | Open source | - |
| Fine-tuning | Tools integrated with the workbench for fine-tuning large language models, e.g. transformers, accelerate, llama-factory etc. | Open source | Multiple/Varies (see component docs) |
| Training (Alauda support for Kubeflow Trainer v2) | Kubernetes-native training job management | Open source | Apache Version 2.0 |
| Model Quantization | Tools integrated with the workbench for model quantization, e.g. llm-compressor etc. | Open source | Multiple/Varies (see component docs) |
| Evaluation | Tools integrated with the workbench for evaluating model performance, e.g. lm-evaluation-harness etc. | Open source | Multiple/Varies (see component docs) |
| Llama Stack (Alauda build of Llama Stack) | Framework for building applications with large language models | Open source | Multiple/Varies (see component docs) |
| Langchain | Tools integrated with the workbench for building LLM applications using Langchain | Open source | Multiple/Varies (see component docs) |
| Dify (Alauda support for Dify) | Platform for building AI assistants and chatbots | Open source | Multiple/Varies (see component docs) |
| MCP Servers | Can integrate with various MCP servers | - | - |
| Agent Tracing (Alauda support for MLflow) | Tracing and monitoring for AI agents | Open source | Apache Version 2.0 |
| Agent Evaluation | Tools integrated with the workbench for evaluating AI agents, e.g. RAGAS etc. | Open source | Multiple/Varies (see component docs) |

Binary file modified docs/en/overview/assets/architecture.png