Skip to content

Update for kfbase installation doc for upgrade#180

Merged
typhoonzero merged 8 commits intomasterfrom
update_kfbase_install_with_gateway
Apr 10, 2026
Merged

Update for kfbase installation doc for upgrade#180
typhoonzero merged 8 commits intomasterfrom
update_kfbase_install_with_gateway

Conversation

@typhoonzero
Copy link
Copy Markdown
Contributor

@typhoonzero typhoonzero commented Apr 9, 2026

Summary by CodeRabbit

  • Documentation
    • Added a Kubeflow installation guide detailing supported plugins, prerequisites, Dex redirect and oauth2-proxy setup (ASM v1/v2), onboarding, and deployment steps for kfbase, kfp, model-registry, and kubeflow-trainer.
    • Added an upgrade guide with version-specific post-upgrade actions for dashboard access (gateway vs NodePort) and pg storage adjustments.
    • Removed the previous Kubeflow Chart Plugins installation page.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 9, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Replaces the old Kubeflow installation page with two new docs: kubeflow/install.mdx (detailed install for Kubeflow plugins on Alauda AI ≥2.0, Dex/oauth2-proxy, ASM v2 notes, onboarding, Model Registry/operator, kubeflow-trainer) and kubeflow/upgrade.mdx (version-specific post-upgrade actions and NodePort fallback instructions).

Changes

Cohort / File(s) Summary
Removed installation doc
docs/en/installation/kubeflow.mdx
Deleted legacy Kubeflow installation page and all its installation guidance.
New install doc
docs/en/kubeflow/install.mdx
Adds comprehensive install guide for Kubeflow plugins (kfbase, kfp, model-registry-operator, deprecated kftraining / kubeflow-trainer), Dex redirect format, oauth2-proxy setup with ASM v1 (deprecated) and ASM v2 steps, Istio webhook cleanup for ASM v2, plugin onboarding (violet push), deployment validation, RBAC/Istio policy snippets, and ModelRegistry operator usage.
New upgrade doc
docs/en/kubeflow/upgrade.mdx
Adds post-upgrade steps: switch dashboard access from NodePort to gateway for upgrades ≤ v1.10.13 (DNS/gateway IP instructions) and manual pgStorageClass/pgStorageSize handling for upgrades ≤ v1.10.9 (rescale, PVC removal, storage recreation steps).

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested reviewers

  • zhaomingkun1030

Poem

🐰 I hopped through lines and nudged each file,
Replaced an old page with two new, neat and agile.
Redirects set, operator seeds sown,
Trainers will run, registries grown.
A little hop — docs polished and warm.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Title check ⚠️ Warning The title mentions only 'kfbase installation doc for upgrade' but the PR actually relocates and restructures documentation across multiple files, creating separate installation and upgrade guides in a new directory structure. Consider a more accurate title like 'Reorganize Kubeflow installation documentation into separate install and upgrade guides' to reflect the actual scope of changes.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch update_kfbase_install_with_gateway

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/en/installation/kubeflow.mdx`:
- Line 153: Update the awkward confirmation sentence in the kubeflow
installation doc: replace “which `<your-kubeflow-domain>` is the
`kubeflowDomain` configured in the kfbase plugin deployment page” with “where
`<your-kubeflow-domain>` is the `kubeflowDomain` configured in the kfbase plugin
deployment page” in the sentence starting “To Confirm: In Administrator - System
Settings - Platform Parameters…” so the grammar reads correctly.
- Around line 155-157: The doc currently instructs users to run `kubectl -n
istio-system get gateway kubeflow-external-gateway` which doesn't yield the
external IP; update the installation note in kubeflow.mdx to provide two
concrete retrieval commands and guidance: for clusters using the Kubernetes
Gateway API (`gateway.networking.k8s.io`) extract the external address from the
Gateway status (use a jsonpath to read `.status.addresses[0].value` from the
`kubeflow-external-gateway` Gateway), and for Istio Gateway
(`networking.istio.io`) query the ingress Service (e.g., `istio-ingressgateway`
in `istio-system`) and extract its loadBalancer ingress IP (use a jsonpath to
read `.status.loadBalancer.ingress[0].ip`); replace the previous generic
`kubectl -n istio-system get gateway kubeflow-external-gateway` instruction with
these two options and a short note to choose the appropriate command based on
the cluster API group, while keeping the reference to retrieving
`kubeflowDomain` from the kfbase plugin configuration.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7828d9a6-0ade-47ad-92da-8a8069448548

📥 Commits

Reviewing files that changed from the base of the PR and between 3a6fd32 and f942e33.

📒 Files selected for processing (1)
  • docs/en/installation/kubeflow.mdx

@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages bot commented Apr 9, 2026

Deploying alauda-ai with  Cloudflare Pages  Cloudflare Pages

Latest commit: 7c7eb71
Status: ✅  Deploy successful!
Preview URL: https://9f8da09e.alauda-ai.pages.dev
Branch Preview URL: https://update-kfbase-install-with-g.alauda-ai.pages.dev

View logs

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
docs/en/installation/kubeflow.mdx (1)

155-159: ⚠️ Potential issue | 🟠 Major

Clarify external address retrieval commands; current instruction is not actionable as written.

Line 155/157/159 still instructs kubectl ... get gateway kubeflow-external-gateway without extracting an address field, and it does not disambiguate Kubernetes Gateway API vs Istio Gateway behavior. This can block DNS/hosts configuration during upgrade.

🔧 Suggested revision
-**Configure DNS Resolution or Host File for `<your-kubeflow-domain>`**: Configure the DNS resolution for `<your-kubeflow-domain>` to point to the IP address of the gateway assigned IP address by `kubectl -n istio-system get gateway kubeflow-external-gateway`, or simply add an entry in the host file of your local machine to map `<your-kubeflow-domain>` to the gateway assigned IP address.
+**Configure DNS Resolution or Host File for `<your-kubeflow-domain>`**: Configure DNS (or your local hosts file) so `<your-kubeflow-domain>` resolves to the external address of `kubeflow-external-gateway`.
+If your cluster uses Kubernetes Gateway API (`gateway.networking.k8s.io`), retrieve it with:
+`kubectl -n istio-system get gateway kubeflow-external-gateway -o jsonpath='{.status.addresses[0].value}'`
+If your cluster uses Istio Gateway (`networking.istio.io`), retrieve it from the ingress Service with:
+`kubectl -n istio-system get svc istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}'`

-> Note: if you are upgrading from `v1.10.13` or older versions of kfbase plugin, the old versions of kfbase use `NodePort` to access the Kubeflow dashboard. After the upgrade, you should configure the DNS resolution or host file for `<your-kubeflow-domain>` to point to the IP address of the gateway assigned IP address, which is different from the old `NodePort` way of accessing. You can get the domain name in the `kubeflowDomain` field of the kfbase plugin configuration, and get the gateway assigned IP address by `kubectl -n istio-system get gateway kubeflow-external-gateway`.
+> Note: If you are upgrading from `v1.10.13` or older kfbase versions, older releases used `NodePort` for dashboard access. After upgrade, configure DNS/hosts so `<your-kubeflow-domain>` points to the gateway-assigned address instead.

-> Note: if you still need to access the Kubeflow dashboard through `NodePort` after the upgrade, you manually modify the service `istio-system/kubeflow-istio-ingressgateway` to type `NodePort`, and get the NodePort assigned port by `kubectl -n istio-system get service kubeflow-istio-ingressgateway` (nodeport for 443 port). Then you can access the Kubeflow dashboard through `https://<ip-of-master-node-of-the-cluster>:<NodePort>/`.
+> Note: If you still need NodePort access after upgrade, you can manually modify service `istio-system/kubeflow-istio-ingressgateway` to `NodePort`, then get the 443 NodePort via `kubectl -n istio-system get service kubeflow-istio-ingressgateway`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/installation/kubeflow.mdx` around lines 155 - 159, Update the
instructions to explicitly show how to retrieve the gateway external address
(different for Gateway API vs Istio Service) instead of just telling the reader
to run kubectl get gateway kubeflow-external-gateway; specifically instruct to
check the Gateway API status.addresses or the Istio ingress Service
(istio-system/kubeflow-istio-ingressgateway) and show how to extract the
external IP/hostname to map to the kubeflowDomain value in the kfbase plugin
configuration (and when using NodePort, explain how to get the NodePort for port
443 from the kubeflow-istio-ingressgateway service to form
https://<ip>:<NodePort>/). Ensure references to the resource names
kubeflow-external-gateway, kubeflowDomain, and
istio-system/kubeflow-istio-ingressgateway are kept so readers know exactly
which objects to query.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/en/installation/kubeflow.mdx`:
- Around line 26-28: Change the grammar in the Kubeflow domain mapping sentence
so it reads “<your-kubeflow-domain> should be set to kubeflowDomain configured
in the kfbase plugin deployment page.” — update the sentence that references
`<your-kubeflow-domain>` and the configuration key `kubeflowDomain` (the text in
the "Platform Access URLs" instructions) to use “should be set to” instead of
“should set to.”

---

Duplicate comments:
In `@docs/en/installation/kubeflow.mdx`:
- Around line 155-159: Update the instructions to explicitly show how to
retrieve the gateway external address (different for Gateway API vs Istio
Service) instead of just telling the reader to run kubectl get gateway
kubeflow-external-gateway; specifically instruct to check the Gateway API
status.addresses or the Istio ingress Service
(istio-system/kubeflow-istio-ingressgateway) and show how to extract the
external IP/hostname to map to the kubeflowDomain value in the kfbase plugin
configuration (and when using NodePort, explain how to get the NodePort for port
443 from the kubeflow-istio-ingressgateway service to form
https://<ip>:<NodePort>/). Ensure references to the resource names
kubeflow-external-gateway, kubeflowDomain, and
istio-system/kubeflow-istio-ingressgateway are kept so readers know exactly
which objects to query.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 912e015f-3e0c-4ed0-b557-bd15dd2c3e8c

📥 Commits

Reviewing files that changed from the base of the PR and between f942e33 and 35f0b08.

📒 Files selected for processing (1)
  • docs/en/installation/kubeflow.mdx

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
docs/en/installation/kubeflow.mdx (2)

28-28: ⚠️ Potential issue | 🟡 Minor

Fix grammar in the Kubeflow domain mapping sentence.

Line 28 should use “should be set to” for correct English.

✏️ Suggested wording
-- `<your-kubeflow-domain>` should set to `kubeflowDomain` configured in the kfbase plugin deployment page.
+- `<your-kubeflow-domain>` should be set to the `kubeflowDomain` configured on the kfbase plugin deployment page.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/installation/kubeflow.mdx` at line 28, Update the sentence containing
`<your-kubeflow-domain>` and `kubeflowDomain` in the Kubeflow mapping line so it
reads "should be set to" instead of "should set to" (i.e., change the phrase to
"`<your-kubeflow-domain>` should be set to `kubeflowDomain` configured in the
kfbase plugin deployment page").

155-155: ⚠️ Potential issue | 🟠 Major

Use actionable gateway address retrieval commands (API-group specific).

Line 155 and Line 161 currently point to kubectl ... get gateway ... as if it directly provides an external IP. That is incomplete and differs by Gateway type, which can block DNS/hosts setup.

🛠️ Suggested revision
-**Configure DNS Resolution or Host File for `<your-kubeflow-domain>`**: Configure the DNS resolution for `<your-kubeflow-domain>` to point to the IP address of the gateway assigned IP address by `kubectl -n istio-system get gateway kubeflow-external-gateway`, or simply add an entry in the host file of your local machine to map `<your-kubeflow-domain>` to the gateway assigned IP address.
+**Configure DNS Resolution or Host File for `<your-kubeflow-domain>`**: Configure DNS (or your local hosts file) so `<your-kubeflow-domain>` resolves to the gateway external address.
+If your cluster uses Kubernetes Gateway API, run:
+`kubectl -n istio-system get gateway kubeflow-external-gateway -o jsonpath='{.status.addresses[0].value}'`
+If your cluster uses Istio `networking.istio.io` Gateway, run:
+`kubectl -n istio-system get svc istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}'`

Run this to verify which path applies in your environment:

#!/bin/bash
set -euo pipefail

echo "Gateway apiVersion:"
kubectl -n istio-system get gateway kubeflow-external-gateway -o jsonpath='{.apiVersion}{"\n"}' || true

echo "Gateway status.addresses (Gateway API clusters):"
kubectl -n istio-system get gateway kubeflow-external-gateway -o jsonpath='{range .status.addresses[*]}{.type}={.value}{"\n"}{end}' || true

echo "Ingress Service external address (Istio Gateway clusters):"
kubectl -n istio-system get svc istio-ingressgateway -o jsonpath='{range .status.loadBalancer.ingress[*]}{.ip}{" "}{.hostname}{"\n"}{end}' || true

Also applies to: 161-161

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/installation/kubeflow.mdx` at line 155, The doc currently assumes
`kubectl -n istio-system get gateway kubeflow-external-gateway` directly yields
an external IP; change the guidance to first check the Gateway resource
apiVersion and status (use `kubectl -n istio-system get gateway
kubeflow-external-gateway -o jsonpath=...` to inspect `.apiVersion` and
`.status.addresses`) and, if the cluster uses the older Istio Gateway pattern,
fall back to querying the ingress Service `istio-ingressgateway` (use `kubectl
-n istio-system get svc istio-ingressgateway -o jsonpath=...` to read
`.status.loadBalancer.ingress[*].ip` or `.hostname`); update the DNS/hosts
instructions to explain both paths and to use the corresponding returned address
(Gateway API `.status.addresses[*].value` vs Service loadBalancer IP/hostname).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/en/installation/kubeflow.mdx`:
- Line 17: The sentence starting "Deploy ASM in the business cluster where
Kubeflow is to be deployed (if ASM was not deployed in the previous step)..." is
ambiguous because the prior step installs KServe, not ASM; update that clause to
clearly state when ASM must be installed (e.g., "if ASM was not already deployed
earlier in this guide" or "if ASM has not been deployed in this
environment/cluster") and keep the note about ASM v2 being required—modify the
parenthetical to read something like "(if ASM has not already been deployed
earlier in this guide or in this cluster; ASM v2 is required—see ASM
documentation)" so readers know the exact condition.
- Line 163: The sentence explaining the NodePort fallback is grammatically
awkward and missing the modal verb "can"; update the sentence in the docs so it
reads clearly and includes "can" (e.g., "If you still need to access the
Kubeflow dashboard through NodePort after the upgrade, you can manually modify
the service `istio-system/kubeflow-istio-ingressgateway` to type `NodePort`,
then get the NodePort assigned for port 443 by running `kubectl -n istio-system
get service kubeflow-istio-ingressgateway`; you can then access the Kubeflow
dashboard at `https://<ip-of-master-node-of-the-cluster>:<NodePort>/`.") Ensure
the phrase "NodePort assigned for port 443" replaces "nodeport for 443 port" and
keep the service name `istio-system/kubeflow-istio-ingressgateway` and the
kubectl command exactly as identifiers to locate the text.

---

Duplicate comments:
In `@docs/en/installation/kubeflow.mdx`:
- Line 28: Update the sentence containing `<your-kubeflow-domain>` and
`kubeflowDomain` in the Kubeflow mapping line so it reads "should be set to"
instead of "should set to" (i.e., change the phrase to "`<your-kubeflow-domain>`
should be set to `kubeflowDomain` configured in the kfbase plugin deployment
page").
- Line 155: The doc currently assumes `kubectl -n istio-system get gateway
kubeflow-external-gateway` directly yields an external IP; change the guidance
to first check the Gateway resource apiVersion and status (use `kubectl -n
istio-system get gateway kubeflow-external-gateway -o jsonpath=...` to inspect
`.apiVersion` and `.status.addresses`) and, if the cluster uses the older Istio
Gateway pattern, fall back to querying the ingress Service
`istio-ingressgateway` (use `kubectl -n istio-system get svc
istio-ingressgateway -o jsonpath=...` to read
`.status.loadBalancer.ingress[*].ip` or `.hostname`); update the DNS/hosts
instructions to explain both paths and to use the corresponding returned address
(Gateway API `.status.addresses[*].value` vs Service loadBalancer IP/hostname).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9c17b346-0996-4eee-8256-21b1639340c4

📥 Commits

Reviewing files that changed from the base of the PR and between 35f0b08 and 381d6a3.

📒 Files selected for processing (1)
  • docs/en/installation/kubeflow.mdx

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
docs/en/kubeflow/install.mdx (1)

17-19: Consider minor rewording to improve flow.

The static analysis tool flagged that items 1, 2, and 3 start with similar patterns. While this doesn't impact clarity, you could optionally rephrase for better readability:

 1. An ACP environment is available and running.
-2. Alauda AI is already deployed. Alauda AI 2.0 or later is required.
-3. Alauda Build of KServe is installed.
+2. Alauda AI 2.0 or later is deployed.
+3. The Alauda Build of KServe is installed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/kubeflow/install.mdx` around lines 17 - 19, Reword the three bullets
in the installation checklist in docs/en/kubeflow/install.mdx so they don’t all
start the same way: edit the bullet list (items currently beginning "An ACP
environment is available and running", "Alauda AI is already deployed. Alauda AI
2.0 or later is required.", and "Alauda Build of KServe is installed") to use
varied phrasing/concise forms (for example: "ACP environment is running",
"Alauda AI (v2.0+) is deployed", "Alauda Build of KServe is installed") to
improve flow and readability. Ensure the meaning and version requirement (Alauda
AI 2.0+) remain unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/en/kubeflow/install.mdx`:
- Around line 154-157: Update the docs text that references "kubectl -n
istio-system get gateway kubeflow-external-gateway" to clarify that this returns
a Gateway object and show how to extract the actual external IP/host depending
on topology: for LoadBalancer instruct users to query the istio-ingressgateway
service and include the example kubectl command using jsonpath to read
.status.loadBalancer.ingress[0].ip or .hostname (e.g., kubectl -n istio-system
get service istio-ingressgateway -o
jsonpath='{.status.loadBalancer.ingress[0].ip}'); for NodePort mention using a
Kubernetes node IP combined with the NodePort from the istio-ingressgateway
service (kubectl -n istio-system get service istio-ingressgateway -o
jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}'); and replace the original
sentence to instruct users to use the appropriate command above to obtain the
IP/host and port when configuring Platform Access URLs and local DNS/hosts
entries.
- Around line 86-90: The docs delete commands use a version-specific webhook
name (istio-sidecar-injector-1-22) which may not match a user's ASM v1
installation; add guidance to first list and verify webhook names (e.g., run
kubectl get mutatingwebhookconfigurations and kubectl get
validatingwebhookconfigurations and filter for "istio") and instruct users to
replace the versioned name (istio-sidecar-injector-1-22) with the exact name
returned before running the kubectl delete commands so they don’t accidentally
miss or delete the wrong webhook.
- Around line 282-290: Verify whether the minimum Kubernetes requirement stated
in the sentence "Kubeflow Trainer v2 requires Kubernetes `1.32.3` or later." is
accurate; if the patch version is not essential, change it to a minor-version
requirement like "Kubernetes 1.32 or later" and update the sentence accordingly,
and if 1.32.3 is required, add a brief rationale listing the specific
APIs/features (e.g., kubelet/CRD changes, admission API fixes, or specific API
versions) that necessitate 1.32.3, citing upstream Kubernetes release notes for
the 1.32.x line and the exact feature(s) used by kubeflow-trainer; ensure the
edited text in the "Deploy `kubeflow-trainer` (Kubeflow Trainer v2)" section
includes the verified minimum version and one short explanatory clause
explaining why that version is required.

---

Nitpick comments:
In `@docs/en/kubeflow/install.mdx`:
- Around line 17-19: Reword the three bullets in the installation checklist in
docs/en/kubeflow/install.mdx so they don’t all start the same way: edit the
bullet list (items currently beginning "An ACP environment is available and
running", "Alauda AI is already deployed. Alauda AI 2.0 or later is required.",
and "Alauda Build of KServe is installed") to use varied phrasing/concise forms
(for example: "ACP environment is running", "Alauda AI (v2.0+) is deployed",
"Alauda Build of KServe is installed") to improve flow and readability. Ensure
the meaning and version requirement (Alauda AI 2.0+) remain unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4a8be636-cb6a-49b0-8427-ef9d43630a51

📥 Commits

Reviewing files that changed from the base of the PR and between 381d6a3 and ed05ddf.

📒 Files selected for processing (3)
  • docs/en/installation/kubeflow.mdx
  • docs/en/kubeflow/install.mdx
  • docs/en/kubeflow/upgrade.mdx
💤 Files with no reviewable changes (1)
  • docs/en/installation/kubeflow.mdx
✅ Files skipped from review due to trivial changes (1)
  • docs/en/kubeflow/upgrade.mdx

Comment on lines +86 to +90
```bash
kubectl delete validatingwebhookconfigurations istiod-default-validator
kubectl delete mutatingwebhookconfigurations istio-sidecar-injector-1-22
kubectl delete mutatingwebhookconfigurations istio-revision-tag-default
```
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check for Istio mutating webhook configurations that may need cleanup

# Search for Istio-related mutating webhooks
kubectl get mutatingwebhookconfigurations -o name | rg -i istio

Repository: alauda/aml-docs

Length of output: 104


🏁 Script executed:

# Read the install.mdx file to see the full context around lines 86-90
head -100 docs/en/kubeflow/install.mdx | tail -20

Repository: alauda/aml-docs

Length of output: 812


🏁 Script executed:

# Get full context of the webhook deletion section
sed -n '80,100p' docs/en/kubeflow/install.mdx

Repository: alauda/aml-docs

Length of output: 816


🏁 Script executed:

# Search for other webhook references in the documentation
rg -i "webhook" docs/en/kubeflow/install.mdx -B2 -A2

Repository: alauda/aml-docs

Length of output: 431


🏁 Script executed:

# Search for any Istio version-specific guidance in the docs
rg -i "istio|version|1-22|1\.22" docs/en/kubeflow/install.mdx

Repository: alauda/aml-docs

Length of output: 1557


Add guidance on identifying webhook names for different Istio versions.

The webhook name istio-sidecar-injector-1-22 includes a version-specific suffix ("1-22" for Istio 1.22). While the note already uses conditional language ("If any ASM v1 webhooks are still present"), users may struggle to identify the correct webhook names if their ASM v1 installation used a different Istio version. Consider adding a command snippet showing how to list existing webhooks:

kubectl get mutatingwebhookconfigurations -o name | grep istio

This helps users verify the exact webhook names in their environment before deletion.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/kubeflow/install.mdx` around lines 86 - 90, The docs delete commands
use a version-specific webhook name (istio-sidecar-injector-1-22) which may not
match a user's ASM v1 installation; add guidance to first list and verify
webhook names (e.g., run kubectl get mutatingwebhookconfigurations and kubectl
get validatingwebhookconfigurations and filter for "istio") and instruct users
to replace the versioned name (istio-sidecar-injector-1-22) with the exact name
returned before running the kubectl delete commands so they don’t accidentally
miss or delete the wrong webhook.

Comment on lines +154 to +157

- In **Administrator** > **System Settings** > **Platform Parameters**, verify that **Platform Access URLs** contains an address in the format `https://<your-kubeflow-domain>`, where `<your-kubeflow-domain>` is the `kubeflowDomain` configured for the `kfbase` plugin.
- Configure DNS resolution, or add a local hosts entry, so that `<your-kubeflow-domain>` resolves to the IP address assigned to `kubectl -n istio-system get gateway kubeflow-external-gateway`.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Clarify how to extract the gateway IP address.

Line 156 states "IP address assigned to kubectl -n istio-system get gateway kubeflow-external-gateway", but this command returns the Gateway object, not directly an IP address. Users need additional steps to extract the actual IP.

Consider providing the complete command or clarifying the extraction process, such as:

  • If using a LoadBalancer: Get the IP from the associated Istio ingress service
  • If using NodePort: Specify which node IP and port to use
  • Provide the full kubectl command with appropriate output formatting
📝 Suggested clarification
-- Configure DNS resolution, or add a local hosts entry, so that `<your-kubeflow-domain>` resolves to the IP address assigned to `kubectl -n istio-system get gateway kubeflow-external-gateway`.
+- Configure DNS resolution, or add a local hosts entry, so that `<your-kubeflow-domain>` resolves to the external IP of the Istio ingress gateway. You can get this IP with:
+  ```bash
+  kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
+  ```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- In **Administrator** > **System Settings** > **Platform Parameters**, verify that **Platform Access URLs** contains an address in the format `https://<your-kubeflow-domain>`, where `<your-kubeflow-domain>` is the `kubeflowDomain` configured for the `kfbase` plugin.
- Configure DNS resolution, or add a local hosts entry, so that `<your-kubeflow-domain>` resolves to the IP address assigned to `kubectl -n istio-system get gateway kubeflow-external-gateway`.
- In **Administrator** > **System Settings** > **Platform Parameters**, verify that **Platform Access URLs** contains an address in the format `https://<your-kubeflow-domain>`, where `<your-kubeflow-domain>` is the `kubeflowDomain` configured for the `kfbase` plugin.
- Configure DNS resolution, or add a local hosts entry, so that `<your-kubeflow-domain>` resolves to the external IP of the Istio ingress gateway. You can get this IP with:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/kubeflow/install.mdx` around lines 154 - 157, Update the docs text
that references "kubectl -n istio-system get gateway kubeflow-external-gateway"
to clarify that this returns a Gateway object and show how to extract the actual
external IP/host depending on topology: for LoadBalancer instruct users to query
the istio-ingressgateway service and include the example kubectl command using
jsonpath to read .status.loadBalancer.ingress[0].ip or .hostname (e.g., kubectl
-n istio-system get service istio-ingressgateway -o
jsonpath='{.status.loadBalancer.ingress[0].ip}'); for NodePort mention using a
Kubernetes node IP combined with the NodePort from the istio-ingressgateway
service (kubectl -n istio-system get service istio-ingressgateway -o
jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}'); and replace the original
sentence to instruct users to use the appropriate command above to obtain the
IP/host and port when configuring Platform Access URLs and local DNS/hosts
entries.

Comment on lines +282 to +290
### 6. Deploy `kubeflow-trainer` (Kubeflow Trainer v2)

> **Note:** If `kftraining` is already deployed, uninstall it before deploying `kubeflow-trainer`.
>
> **Note:** Install the LWS plugin before deploying `kubeflow-trainer`, because LWS is a dependency of `kubeflow-trainer`.
>
> **Note:** Kubeflow Trainer v2 requires Kubernetes `1.32.3` or later. Older Kubernetes versions may lead to unexpected behavior.

In **Cluster Plugins**, find `kubeflow-trainer`, click **Install**, choose whether to enable `JobSet`, and complete the installation.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Verify the Kubernetes version requirement.

Line 288 states "Kubeflow Trainer v2 requires Kubernetes 1.32.3 or later." This is a very specific version requirement. Please verify:

  1. Is this the correct minimum version?
  2. Should this be a minor version requirement (e.g., "1.32 or later") rather than a specific patch version?
  3. What specific features or APIs introduced in 1.32.3 are required?

If the version is correct, consider adding a brief explanation of why this specific version is required to help users understand the rationale.

What is the latest stable version of Kubernetes and when was Kubernetes 1.32 released?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/kubeflow/install.mdx` around lines 282 - 290, Verify whether the
minimum Kubernetes requirement stated in the sentence "Kubeflow Trainer v2
requires Kubernetes `1.32.3` or later." is accurate; if the patch version is not
essential, change it to a minor-version requirement like "Kubernetes 1.32 or
later" and update the sentence accordingly, and if 1.32.3 is required, add a
brief rationale listing the specific APIs/features (e.g., kubelet/CRD changes,
admission API fixes, or specific API versions) that necessitate 1.32.3, citing
upstream Kubernetes release notes for the 1.32.x line and the exact feature(s)
used by kubeflow-trainer; ensure the edited text in the "Deploy
`kubeflow-trainer` (Kubeflow Trainer v2)" section includes the verified minimum
version and one short explanatory clause explaining why that version is
required.

@xZhizhizhizhizhi
Copy link
Copy Markdown

/test-pass

@typhoonzero typhoonzero merged commit 0cae9f0 into master Apr 10, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants