
feat: provide hardware profile#161

Open
gdzamg wants to merge 4 commits into master from feat/add-hardware-profile

Conversation

@gdzamg

@gdzamg gdzamg commented Mar 24, 2026

Summary by CodeRabbit

  • Documentation
    • Added comprehensive Hardware Profile docs: overview, admin workflows for create/update/delete, CPU vs GPU examples, a CLI creation guide, and guidance on scheduling with node selectors and tolerations; also describes UI selection, constraint viewing, and validation behavior.
    • Revised Inference Service publishing docs: clarified creation flows, restructured parameter descriptions and templates, and updated model/source and deployment configuration fields.

@coderabbitai

coderabbitai bot commented Mar 24, 2026

Walkthrough

Adds a set of documentation pages for Hardware Profile management (concepts, admin workflows, CLI/how‑to guides, CPU/GPU examples, scheduling) and updates the Inference Service creation flow and parameter tables to reflect model source and deployment config changes.

Changes

  • Hardware Profile Overview (docs/en/infrastructure_management/hardware_profile/index.mdx, docs/en/infrastructure_management/hardware_profile/intro.mdx): New overview and intro pages defining HardwareProfile, its resource identifiers, scheduling (node selectors/tolerations), webhook validation, and its relationship to InferenceService/LLMInferenceService.
  • Functional Guides (docs/en/infrastructure_management/hardware_profile/functions/index.mdx, docs/en/infrastructure_management/hardware_profile/functions/hardware_profile.mdx): New functional docs detailing admin workflows to create, update, and delete HardwareProfile resources, including resource boundaries (min/default/optional max), scheduling rules, and UI selection/validation behavior.
  • How-To Guides (docs/en/infrastructure_management/hardware_profile/how_to/index.mdx, docs/en/infrastructure_management/hardware_profile/how_to/create_hardware_profile_cli.mdx, docs/en/infrastructure_management/hardware_profile/how_to/cpu_and_gpu_profiles.mdx, docs/en/infrastructure_management/hardware_profile/how_to/schedule_to_specific_gpu_nodes.mdx): New practical guides with CLI manifest examples, CPU-only and GPU-enabled YAML samples, and node selector and toleration guidance for scheduling onto specific GPU nodes.
  • Inference Service update (docs/en/model_inference/inference_service/functions/inference_service.mdx): Restructured the Inference Service creation flow (new entry points), reordered steps, added an explicit Model Location for Custom Publish, and reworked parameter tables and formatting to reflect conditional fields and deployment config distinctions.
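The cohorts above revolve around a single `HardwareProfile` resource that bundles resource boundaries with node scheduling rules. Purely as an illustrative sketch (the CRD's actual apiVersion and field names are not visible on this page, so every field below is an assumption; only the resource name, the kube-public namespace, and the per-identifier displayName are mentioned elsewhere in this review), such a manifest might look like:

```yaml
# Illustrative sketch only: the apiVersion and field names are assumptions,
# not the actual HardwareProfile CRD schema from this PR.
apiVersion: example.com/v1alpha1
kind: HardwareProfile
metadata:
  name: gpu-high-performance-profile
  namespace: kube-public
spec:
  identifiers:
    - displayName: GPU                  # label shown in the UI
      resourceType: nvidia.com/gpu
      minCount: 1                       # lower bound enforced at validation time
      defaultCount: 1
      maxCount: 4                       # optional upper bound
  scheduling:
    nodeSelector:
      nvidia.com/gpu.present: "true"
    tolerations:
      - key: nvidia.com/gpu
        operator: Exists
        effect: NoSchedule
```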

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • ASM-21393 refine inference docs #4 — Overlapping edits to docs/en/model_inference/inference_service/functions/inference_service.mdx (restructured publishing/workflow content).

Suggested reviewers

  • zhaomingkun1030
  • typhoonzero

Poem

🐰 Hop, hop, the docs now gleam and glow,
Profiles for hardware help workloads go,
CPU, GPU, YAML in a row,
Selectors guide where containers flow,
A rabbit cheers — deployments grow! 🚀

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title 'feat: provide hardware profile' clearly describes the main change: adding comprehensive Hardware Profile documentation across multiple new pages.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files; docstring coverage check skipped.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@docs/en/infrastructure_management/hardware_profile/functions/hardware_profile.mdx`:
- Line 5: Update the document heading text that currently reads "Hardware
Profile Manage" to the corrected phrase "Hardware Profile Management" in the
heading line (the H1 that starts with "# Hardware Profile Manage") so the title
uses proper grammar; replace only the heading text and keep the surrounding
markdown syntax intact.

In
`@docs/en/infrastructure_management/hardware_profile/how_to/create_hardware_profile_cli.mdx`:
- Line 17: Update the prerequisite sentence that currently reads "You have a
namespace where you have permissions to view or create `HardwareProfile`
resources (typically a cluster-scoped resource or in a specific admin
namespace)" to state that `HardwareProfile` is a namespaced resource and users
must have permissions in the specific target namespace; reference the example
YAML's namespace (e.g., `namespace: kube-public`) as an example and remove the
"cluster-scoped" wording so the doc correctly requires namespace-level
permissions to view/create `HardwareProfile`.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5c407a22-b403-4c68-bdb7-0bd98f28d733

📥 Commits

Reviewing files that changed from the base of the PR and between 904d33b and 5838417.

📒 Files selected for processing (8)
  • docs/en/infrastructure_management/hardware_profile/functions/hardware_profile.mdx
  • docs/en/infrastructure_management/hardware_profile/functions/index.mdx
  • docs/en/infrastructure_management/hardware_profile/how_to/cpu_and_gpu_profiles.mdx
  • docs/en/infrastructure_management/hardware_profile/how_to/create_hardware_profile_cli.mdx
  • docs/en/infrastructure_management/hardware_profile/how_to/index.mdx
  • docs/en/infrastructure_management/hardware_profile/how_to/schedule_to_specific_gpu_nodes.mdx
  • docs/en/infrastructure_management/hardware_profile/index.mdx
  • docs/en/infrastructure_management/hardware_profile/intro.mdx

@cloudflare-workers-and-pages

Deploying alauda-ai with Cloudflare Pages

Latest commit: 5838417
Status: ✅  Deploy successful!
Preview URL: https://ba41d778.alauda-ai.pages.dev
Branch Preview URL: https://feat-add-hardware-profile.alauda-ai.pages.dev



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (9)
docs/en/model_inference/inference_service/functions/inference_service.mdx (1)

99-99: Optional: Use hyphenated "cloud-native" as compound adjective.

When "cloud native" modifies a noun (in this case, "inference services"), it should be hyphenated as "cloud-native" per standard English grammar for compound adjectives.

✏️ Suggested change
-AML completes the release and operation of cloud native inference services based on [kserve](https://github.com/kserve/kserve) InferenceService CRD. If you are familiar with the use of kserve, you can also click the "YAML" button in the upper right corner when "Publish inference service directly from model" to directly modify the YAML file to complete more advanced release operations.
+AML completes the release and operation of cloud-native inference services based on [kserve](https://github.com/kserve/kserve) InferenceService CRD. If you are familiar with the use of kserve, you can also click the "YAML" button in the upper right corner when "Publish inference service directly from model" to directly modify the YAML file to complete more advanced release operations.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/model_inference/inference_service/functions/inference_service.mdx` at
line 99, Replace the unhyphenated phrase "cloud native inference services" with
the hyphenated compound adjective "cloud-native inference services" in the
sentence that begins "AML completes the release and operation of cloud native
inference services..." (look for the phrase "cloud native inference services" in
inference_service.mdx) so the compound adjective correctly modifies "inference
services."
docs/en/infrastructure_management/hardware_profile/how_to/create_hardware_profile_cli.mdx (2)

66-66: Optional: Remove redundant namespace flag.

The -n kube-public flag is redundant because the YAML file already specifies namespace: kube-public in the metadata (line 28). While not incorrect, removing it simplifies the command and relies on the namespace declared in the manifest.

🔧 Suggested simplification
-kubectl apply -f gpu-high-performance-profile.yaml -n kube-public
+kubectl apply -f gpu-high-performance-profile.yaml
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/en/infrastructure_management/hardware_profile/how_to/create_hardware_profile_cli.mdx`
at line 66, The kubectl command in create_hardware_profile_cli.mdx redundantly
includes the namespace flag; update the line containing "kubectl apply -f
gpu-high-performance-profile.yaml -n kube-public" to remove the "-n kube-public"
portion so the command relies on the manifest's metadata namespace instead.

15-17: Optional: Consider varying sentence structure in prerequisites.

The three prerequisite bullets all begin with "You have," creating repetitive phrasing. While acceptable for technical documentation, consider rewording for improved readability.

✏️ Suggested rewording
 - You have access to a Kubernetes cluster with the platform installed.
-- You have configured `kubectl` to communicate with your cluster.
-- You have permissions in the namespace where `HardwareProfile` resources are managed (for example, an admin namespace such as `kube-public`).
+- `kubectl` is configured to communicate with your cluster.
+- Permissions are available in the namespace where `HardwareProfile` resources are managed (for example, an admin namespace such as `kube-public`).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/en/infrastructure_management/hardware_profile/how_to/create_hardware_profile_cli.mdx`
around lines 15 - 17, The three prerequisite bullets in
create_hardware_profile_cli.mdx all start with "You have," creating repetitive
phrasing; change the three bullets to vary sentence structure by using different
sentence starters (e.g., "Access a Kubernetes cluster with the platform
installed," "Configure kubectl to communicate with your cluster," "Ensure you
have permissions in the namespace where HardwareProfile resources are managed"),
preserve the meaning and examples (like kube-public) and keep them concise and
parallel in tone.
docs/en/infrastructure_management/hardware_profile/functions/hardware_profile.mdx (6)

122-122: Simplify awkward phrasing.

The phrase "strictly must conceptually remain within the range" is awkward and overly emphatic. Simplify for clarity.

✏️ Suggested rewording
-* Depending on your precise workload needs, click the **Custom Configuration** button displayed dynamically below the hardware profile section. Custom requests and limits strictly must conceptually remain *within the range* defined by the hardware profile's minimum and maximum constraints.
+* Depending on your workload needs, click the **Custom Configuration** button below the hardware profile section. Custom requests and limits must remain within the range defined by the hardware profile's minimum and maximum constraints.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/en/infrastructure_management/hardware_profile/functions/hardware_profile.mdx`
at line 122, Replace the awkward sentence "Custom requests and limits strictly
must conceptually remain *within the range* defined by the hardware profile's
minimum and maximum constraints." with a simpler, clearer sentence such as:
"Custom requests and limits must remain within the hardware profile's minimum
and maximum constraints." Locate that exact sentence in the content (the line
following the Custom Configuration button note) and update it to the simplified
phrasing.
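The rule this nitpick polishes (custom requests and limits must stay inside the window defined by the profile's minimum and optional maximum) is simple enough to sketch. A minimal illustration under stated assumptions, not the platform's actual webhook or UI validation code, which is not part of this PR:

```python
# Sketch of the range check described in the docs: a custom value is accepted
# only if it is at or above the profile minimum and, when a maximum is set,
# at or below that maximum. Illustrative only; not the platform's real code.

def within_profile_bounds(value, minimum, maximum=None):
    """Return True if minimum <= value, and value <= maximum when one is set."""
    if value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    return True

# Example: a profile allowing 1 to 4 GPUs
assert within_profile_bounds(2, 1, 4)        # within range: accepted
assert not within_profile_bounds(8, 1, 4)    # above max: rejected
assert not within_profile_bounds(0, 1, 4)    # below min: rejected
assert within_profile_bounds(6, 1, None)     # no max defined: accepted
```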

49-52: Optional: Simplify emphatic language.

The section uses phrases like "rigidly control" and "physically correct node pools" which may be overly emphatic. Consider more neutral phrasing such as "To control which nodes..." and "appropriate node pools."

✏️ Suggested rewording
-To rigidly control which nodes the inference workload schedule applies to, set Node Selectors and Tolerations. This ensures high-performance workloads land on the physically correct node pools.
+To control which nodes the inference workload schedule applies to, set Node Selectors and Tolerations. This ensures high-performance workloads land on the appropriate node pools.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/en/infrastructure_management/hardware_profile/functions/hardware_profile.mdx`
around lines 49 - 52, Update the "Step 5: Configure node scheduling rules" text
to use neutral wording: replace phrases like "To rigidly control which nodes the
inference workload schedule applies to" with "To control which nodes the
inference workload is scheduled on" and change "physically correct node pools"
to "appropriate node pools"; keep the instructional content about Node Selectors
and Tolerations intact (references: the "Step 5: Configure node scheduling
rules" heading, the "Node Selectors" and "Tolerations" sections).

101-101: Simplify emphatic language in the deletion impact note.

The note contains excessive adverbs ("safely," "flawlessly," "immediately") that don't add technical clarity. The core information is accurate but can be stated more concisely.

✏️ Suggested rewording
-*Note: Deleting a hardware profile does not delete or actively disrupt running inference services that previously deployed with this profile. They will continue to operate flawlessly with the resource limitations and topology constraints initially injected by the platform's webhook. However, the deleted hardware profile will immediately disappear from the profile selection dropdown for all newly created deployments.*
+*Note: Deleting a hardware profile does not delete or disrupt running inference services that were deployed with this profile. They will continue to operate with the resource limitations and topology constraints injected by the platform's webhook. However, the deleted hardware profile will no longer appear in the profile selection dropdown for newly created deployments.*
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/en/infrastructure_management/hardware_profile/functions/hardware_profile.mdx`
at line 101, Edit the deletion impact note in the hardware profile documentation
to remove excessive emphatic adverbs and tighten the sentence; locate the
paragraph that begins with "Note: Deleting a hardware profile..." in
hardware_profile.mdx and replace the sentence with a concise version that
states: deleting a profile does not remove or disrupt existing inference
services deployed with that profile, and the profile will no longer appear in
the selection dropdown for new deployments.

94-94: Optional: Simplify "far right side" to "right side."

The phrase "far right side" can be simplified to "right side" for conciseness, as the action menu location is clear without the intensifier "far."

✏️ Suggested change
-Click the Action menu (⋮) on the far right side of the relevant hardware profile row, and securely select **Delete**.
+Click the Action menu (⋮) on the right side of the relevant hardware profile row, and select **Delete**.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/en/infrastructure_management/hardware_profile/functions/hardware_profile.mdx`
at line 94, Replace the phrase "far right side" with "right side" in the
sentence "Click the Action menu (⋮) on the far right side of the relevant
hardware profile row, and securely select **Delete**." so it reads "Click the
Action menu (⋮) on the right side of the relevant hardware profile row, and
securely select **Delete**."

82-82: Simplify emphatic language in the update impact note.

The note contains excessive adverbs ("firmly," "explicitly") that don't add technical value. The core information about update impact is accurate and important, but the phrasing can be more concise.

✏️ Suggested rewording
-*Note: Updating a hardware profile typically affects solely newly configured workloads going forward. Active deployments previously instantiated using this hardware profile will firmly preserve their originally injected constraints. To enforce the new hardware profile settings on an already-running workload, you must explicitly edit or redeploy the corresponding inference service.*
+*Note: Updating a hardware profile affects only newly configured workloads. Active deployments using this hardware profile preserve their original constraints. To apply the new settings to an already-running workload, edit or redeploy the corresponding inference service.*
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/en/infrastructure_management/hardware_profile/functions/hardware_profile.mdx`
at line 82, The note that begins "Note: Updating a hardware profile typically
affects..." uses unnecessary emphatic adverbs ("firmly," "explicitly"); replace
that sentence with a concise version that removes those adverbs while keeping
the meaning: state that updates only apply to newly configured workloads and
that to apply changes to running deployments you must edit or redeploy the
corresponding inference service (refer to the sentence starting "Note: Updating
a hardware profile..." and the phrase "inference service" to locate the text).

13-15: Optional: Vary prerequisite sentence structure.

Three successive prerequisites begin with "You have" or "You are," creating repetitive phrasing. Consider varying the structure for improved readability, though this is acceptable for prerequisite lists.

✏️ Suggested rewording
 * You have logged in to the platform as a user with administrator privileges.
-- You have verified your desired computing resources, including CPU, memory, and any specialized accelerators (e.g., GPU models) available in the underlying Kubernetes cluster.
-- You are familiar with Kubernetes scheduling concepts such as Node Selectors, Taints, and Tolerations.
+* Your desired computing resources, including CPU, memory, and any specialized accelerators (e.g., GPU models), are verified as available in the underlying Kubernetes cluster.
+* Familiarity with Kubernetes scheduling concepts such as Node Selectors, Taints, and Tolerations is required.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/en/infrastructure_management/hardware_profile/functions/hardware_profile.mdx`
around lines 13 - 15, The three prerequisite bullets all start with "You have"
or "You are," causing repetitive phrasing; update the three lines ("You have
logged in to the platform as a user with administrator privileges.", "You have
verified your desired computing resources, including CPU, memory, and any
specialized accelerators (e.g., GPU models) available in the underlying
Kubernetes cluster.", "You are familiar with Kubernetes scheduling concepts such
as Node Selectors, Taints, and Tolerations.") to vary sentence openings—for
example, change one to an imperative ("Log in to the platform as an
administrator"), another to a noun phrase ("Verified computing resources: CPU,
memory, and any accelerators (e.g., GPUs) on the cluster"), and the third to a
short descriptor ("Familiarity with Kubernetes scheduling concepts (Node
Selectors, Taints, Tolerations)"), ensuring meaning is preserved and tone
remains consistent with the rest of the document.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 36fdd26d-7a48-4777-8cb8-ad54e88ff4ae

📥 Commits

Reviewing files that changed from the base of the PR and between 5838417 and e76448a.

📒 Files selected for processing (3)
  • docs/en/infrastructure_management/hardware_profile/functions/hardware_profile.mdx
  • docs/en/infrastructure_management/hardware_profile/how_to/create_hardware_profile_cli.mdx
  • docs/en/model_inference/inference_service/functions/inference_service.mdx


When users (such as data scientists, AI engineers, and developers) dynamically create or configure model inference services (both `InferenceService` and `LLMInferenceService`), they can leverage predefined hardware profiles efficiently.

A hardware profile seamlessly streamlines the tedious task of manually configuring intricate node scheduling rules and setting explicit resource limitations. Depending on your workload specifics, you have the flexibility to accept the strict default configurations or finely customize your limits within the officially boundaries authorized by the selected profile.


⚠️ Potential issue | 🟡 Minor

Fix typo and simplify verbose phrasing.

Line 107 contains a typo ("officially boundaries" should be "official boundaries") and uses verbose phrasing ("seamlessly streamlines the tedious task," "intricate," "finely customize") that can be simplified.

✏️ Suggested rewording
-A hardware profile seamlessly streamlines the tedious task of manually configuring intricate node scheduling rules and setting explicit resource limitations. Depending on your workload specifics, you have the flexibility to accept the strict default configurations or finely customize your limits within the officially boundaries authorized by the selected profile.
+A hardware profile streamlines the task of manually configuring node scheduling rules and resource limitations. Depending on your workload requirements, you can accept the default configurations or customize your limits within the official boundaries authorized by the selected profile.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/en/infrastructure_management/hardware_profile/functions/hardware_profile.mdx`
at line 107, Fix the typo "officially boundaries" to "official boundaries" and
simplify the verbose sentence at line containing "A hardware profile seamlessly
streamlines the tedious task..." by replacing it with a clearer, shorter
sentence (for example: "A hardware profile simplifies node scheduling and
resource limits; you can use the strict defaults or adjust limits within the
selected profile's official boundaries."). Update the sentence in
hardware_profile.mdx where that exact phrasing appears.

```
gpu-high-performance-profile 2m
```

Once correctly applied, your Data Scientists will be able to select **GPU High Performance** when deploying their Inference Services using the UI, and the constraints specified in the profile will automatically validate and inject into the deployed workloads.


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check if HardwareProfile CRD or examples include displayName field

# Search for displayName in HardwareProfile context
rg -n "displayName" docs/en/infrastructure_management/hardware_profile/ -B 2 -A 2

# Search for any UI display name documentation
rg -n "display.*name|Display.*Name" docs/en/infrastructure_management/hardware_profile/ -i

Repository: alauda/aml-docs

Length of output: 6108


🏁 Script executed:

#!/bin/bash
# Search for profile metadata structure and any displayName at profile level
rg -n "metadata:|name:|spec:" docs/en/infrastructure_management/hardware_profile/how_to/create_hardware_profile_cli.mdx -A 3 -B 1 | head -50

# Also check the full YAML structure of the example profile around line 84 context
sed -n '15,90p' docs/en/infrastructure_management/hardware_profile/how_to/create_hardware_profile_cli.mdx

Repository: alauda/aml-docs

Length of output: 2406


Clarify or document the profile display name in the UI.

The line states users will select "GPU High Performance" in the UI, but the YAML shows the profile name as gpu-high-performance-profile. The current example does not include a displayName field at the profile level (only within resource identifiers like GPU, CPU, Memory). Either add a displayName field to the HardwareProfile spec/metadata in the example and explain its purpose, or document how the UI transforms the profile name for display.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/en/infrastructure_management/hardware_profile/how_to/create_hardware_profile_cli.mdx`
at line 84, The doc currently references selecting "GPU High Performance" in the
UI but the example HardwareProfile uses the name gpu-high-performance-profile
and has no profile-level displayName; update the example HardwareProfile (the
HardwareProfile spec/metadata) to include a profile-level displayName (e.g.,
displayName: "GPU High Performance") and add one sentence explaining that
displayName is shown in the UI while the metadata.name
(gpu-high-performance-profile) is the resource identifier, or alternatively
explain how the UI derives a human-friendly label from metadata.name if you
prefer not to add displayName; refer to HardwareProfile, metadata.name
(gpu-high-performance-profile) and displayName in the change.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/en/model_inference/inference_service/functions/inference_service.mdx`:
- Line 241: The table entry for the parameter `num_inference_steps` uses "higher
quality" but should use the hyphenated compound adjective "higher-quality";
update the text in the `num_inference_steps` row to read "higher-quality images"
so the documentation uses correct compound-adjective punctuation.
- Line 101: Update the sentence that currently reads "AML completes the release
and operation of cloud native inference services..." to use the hyphenated
compound adjective "cloud-native" (so it reads "cloud-native inference
services") in the inference_service.mdx content; locate the sentence containing
"kserve InferenceService CRD" / "cloud native inference services" and replace
"cloud native" with "cloud-native".
- Line 230: The table cell for the `eta_cutoff` description contains malformed
syntax: replace the `_ exp(...)` fragment with `* exp(...)` to correctly
indicate multiplication and remove the broken lint comment `{/_ lint ignore
unit-case \*/}` (or convert it to a valid markdown comment or proper lint
pragma) so the sentence reads "...greater than `eta_cutoff` or
sqrt(`eta_cutoff`) * exp(-entropy(softmax(next_token_logits))). Suggested values
range from 3E-4 to 2E-3..." and ensure the `eta_cutoff` row text is valid
Markdown.
- Around line 116-117: The table in inference_service.mdx uses the label "Config
Source" but the related hardware_profile.mdx uses "Config Type"; update
inference_service.mdx to use the same term ("Config Type") to match
hardware_profile.mdx everywhere this dropdown is described (e.g., the table row
currently labeled "Config Source" and any other occurrences in the Deployment
Resources section), and verify the Hardware Profile row text still reads
consistently when the label is changed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 55342d83-3b8b-403d-9f72-0f2b15c6ebae

📥 Commits

Reviewing files that changed from the base of the PR and between e76448a and 460bd33.

📒 Files selected for processing (1)
  • docs/en/model_inference/inference_service/functions/inference_service.mdx


**Note:** The inference service will automatically scale up and down between the "minimum number of replicas" and "maximum number of replicas" according to the request traffic. If the "minimum number of replicas" is set to 0, the inference service will automatically pause and release resources when there is no request for a period of time. At this time, if a request comes, the inference service can automatically start and load the model cached in the PVC.

AML completes the release and operation of cloud native inference services based on [kserve](https://github.com/kserve/kserve) InferenceService CRD. If you are familiar with the use of kserve, you can also click the "YAML" button in the upper right corner when "Publish inference service directly from model" to directly modify the YAML file to complete more advanced release operations.

⚠️ Potential issue | 🟡 Minor

Use "cloud-native" (hyphenated) as a compound adjective.

When used as a compound adjective before a noun, "cloud-native" should be hyphenated.

📝 Suggested fix
-AML completes the release and operation of cloud native inference services based on [kserve](https://github.com/kserve/kserve) InferenceService CRD.
+AML completes the release and operation of cloud-native inference services based on [kserve](https://github.com/kserve/kserve) InferenceService CRD.
🧰 Tools
🪛 LanguageTool

[grammar] ~101-~101: Use a hyphen to join words.
Context: ...letes the release and operation of cloud native inference services based on [kser...

(QB_NEW_EN_HYPHEN)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/model_inference/inference_service/functions/inference_service.mdx` at
line 101, Update the sentence that currently reads "AML completes the release
and operation of cloud native inference services..." to use the hyphenated
compound adjective "cloud-native" (so it reads "cloud-native inference
services") in the inference_service.mdx content; locate the sentence containing
"kserve InferenceService CRD" / "cloud native inference services" and replace
"cloud native" with "cloud-native".

Comment on lines +116 to +117
| Config Source | Required, Choose the source for deploying resources: `Hardware profile` or `Custom`. |
| Hardware Profile | Required when Config Source is `Hardware profile`. Select a predefined hardware profile that configures the required resources. |

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for "Config Source" and "Config Type" usage in documentation
echo "=== Files using 'Config Source' ==="
rg -n "Config Source" --type=md --type=mdx -g '*.mdx' || rg -n "Config Source" -g '*.md' -g '*.mdx'

echo ""
echo "=== Files using 'Config Type' ==="
rg -n "Config Type" --type=md --type=mdx -g '*.mdx' || rg -n "Config Type" -g '*.md' -g '*.mdx'

Repository: alauda/aml-docs

Length of output: 2505


🏁 Script executed:

cat -n docs/en/infrastructure_management/hardware_profile/functions/hardware_profile.mdx | sed -n '105,125p'

Repository: alauda/aml-docs

Length of output: 2738


Terminology mismatch with hardware profile documentation.

This file uses "Config Source" (Line 116) while docs/en/infrastructure_management/hardware_profile/functions/hardware_profile.mdx uses "Config Type" (line 115) to describe the same dropdown field for choosing between "Hardware Profile" and "Custom" resource configuration in the Deployment Resources section.

This inconsistency will confuse users who navigate between these related documentation pages. Please align the terminology across both files.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/model_inference/inference_service/functions/inference_service.mdx`
around lines 116 - 117, The table in inference_service.mdx uses the label
"Config Source" but the related hardware_profile.mdx uses "Config Type"; update
inference_service.mdx to use the same term ("Config Type") to match
hardware_profile.mdx everywhere this dropdown is described (e.g., the table row
currently labeled "Config Source" and any other occurrences in the Deployment
Resources section), and verify the Hardware Profile row text still reads
consistently when the label is changed.

| `penalty_alpha` | float | Contrastive search is enabled when `penalty_alpha` is greater than 0 and `top_k` is greater than 1. The larger the `penalty_alpha` value, the stronger the contrastive penalty, and the more likely the generated text is to meet expectations. If the `penalty_alpha` value is set too large, it may cause the generated text to be too uniform. |
| `typical_p` | float | Local typicality measures the similarity between the conditional probability of predicting the next target token and the expected conditional probability of predicting the next random token given the generated partial text. If set to a floating-point number less than 1, the smallest set of locally typical tokens whose probabilities add up to or exceed `typical_p` will be retained for generation. |
| `epsilon_cutoff` | float | If set to a floating-point number strictly between 0 and 1, only tokens with conditional probabilities greater than `epsilon_cutoff` will be sampled. Suggested values range from {/* lint ignore unit-case */} 3e-4 to 9e-4, depending on the model size. |
| `eta_cutoff` | float | Eta sampling is a hybrid of local typical sampling and epsilon sampling. If set to a floating-point number strictly between 0 and 1, a token will only be considered if it is greater than `eta_cutoff` or sqrt(`eta_cutoff`) _ exp(-entropy(softmax(next_token_logits))). Suggested values range from {/_ lint ignore unit-case \*/} 3E-4 to 2E-3, depending on the model size. |

⚠️ Potential issue | 🟡 Minor

Malformed markdown/lint comment syntax.

The text contains what appears to be corrupted or malformed syntax:

  • _ exp(...) - likely should be * exp(...) (multiplication)
  • {/_ lint ignore unit-case \*/} - malformed lint ignore comment

This will likely render incorrectly for users.

📝 Suggested fix
-| `eta_cutoff`         | float     | Eta sampling is a hybrid of local typical sampling and epsilon sampling. If set to a floating-point number strictly between 0 and 1, a token will only be considered if it is greater than `eta_cutoff` or sqrt(`eta_cutoff`) _ exp(-entropy(softmax(next_token_logits))). Suggested values range from {/_ lint ignore unit-case \*/} 3E-4 to 2E-3, depending on the model size.                               |
+| `eta_cutoff`         | float     | Eta sampling is a hybrid of local typical sampling and epsilon sampling. If set to a floating-point number strictly between 0 and 1, a token will only be considered if it is greater than `eta_cutoff` or sqrt(`eta_cutoff`) * exp(-entropy(softmax(next_token_logits))). Suggested values range from {/* lint ignore unit-case */} 3E-4 to 2E-3, depending on the model size.                               |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/model_inference/inference_service/functions/inference_service.mdx` at
line 230, The table cell for the `eta_cutoff` description contains malformed
syntax: replace the `_ exp(...)` fragment with `* exp(...)` to correctly
indicate multiplication and remove the broken lint comment `{/_ lint ignore
unit-case \*/}` (or convert it to a valid markdown comment or proper lint
pragma) so the sentence reads "...greater than `eta_cutoff` or
sqrt(`eta_cutoff`) * exp(-entropy(softmax(next_token_logits))). Suggested values
range from 3E-4 to 2E-3..." and ensure the `eta_cutoff` row text is valid
Markdown.
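As a sanity check on the corrected formula, here is a minimal Python sketch of the eta-sampling condition. A probability passes if it exceeds `eta_cutoff` *or* `sqrt(eta_cutoff) * exp(-entropy)`, which is equivalent to thresholding at the minimum of the two. This is only an illustration of the documented condition, not the model server's actual implementation:

```python
import math

def eta_sampling_mask(probs, eta_cutoff=2e-3):
    """Keep tokens whose probability exceeds eta_cutoff OR
    sqrt(eta_cutoff) * exp(-entropy(probs)) -- i.e. the min of the two."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    threshold = min(eta_cutoff, math.sqrt(eta_cutoff) * math.exp(-entropy))
    return [p > threshold for p in probs]

# A near-deterministic distribution has low entropy, so the entropy term
# is large and the plain eta_cutoff dominates: the 0.001 tails are cut.
print(eta_sampling_mask([0.998, 0.001, 0.001]))  # [True, False, False]
```

With a flatter distribution the entropy term shrinks below `eta_cutoff`, and the effective threshold is the entropy-scaled one instead.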

| `use_cache` | bool | Whether to use the intermediate results calculated by the model during the generation process. |
| Parameter | Data Type | Description |
| --------------------- | --------- | ----------------------------------------------------------------------------------------------------------------- |
| `num_inference_steps` | int | The number of denoising steps. More denoising steps usually result in higher quality images but slower inference. |

⚠️ Potential issue | 🟡 Minor

Use "higher-quality" (hyphenated) as a compound adjective.

📝 Suggested fix
-| `num_inference_steps` | int       | The number of denoising steps. More denoising steps usually result in higher quality images but slower inference. |
+| `num_inference_steps` | int       | The number of denoising steps. More denoising steps usually result in higher-quality images but slower inference. |
🧰 Tools
🪛 LanguageTool

[grammar] ~241-~241: Use a hyphen to join words.
Context: ...denoising steps usually result in higher quality images but slower inference. | |...

(QB_NEW_EN_HYPHEN)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/model_inference/inference_service/functions/inference_service.mdx` at
line 241, The table entry for the parameter `num_inference_steps` uses "higher
quality" but should use the hyphenated compound adjective "higher-quality";
update the text in the `num_inference_steps` row to read "higher-quality images"
so the documentation uses correct compound-adjective punctuation.
