Add how to upload models using notebook #47
Walkthrough
Adds a new how-to doc for uploading models from a Notebook/Workbench to a Model Repository using Git and Git LFS, and replaces the in-file procedural "Create Model Repository" section with a link to that how-to.
Sequence Diagram(s)
sequenceDiagram
autonumber
participant Notebook
participant LocalGit as "Local Git repo\n(+ Git LFS)"
participant ModelRepo as "Model Repository\n(remote, Git endpoint)"
Note over Notebook,LocalGit: Setup in Notebook / Workbench
Notebook->>LocalGit: fetch model files (huggingface/modelscope/mirror)
Notebook->>LocalGit: configure Git LFS (.gitattributes, git lfs install)
Notebook->>LocalGit: git init / add / commit
LocalGit->>ModelRepo: git remote add / push (LFS objects transferred)
alt push succeeds
ModelRepo-->>Notebook: repo URL / success
Notebook->>ModelRepo: update metadata (task, framework) & publish
else push fails
ModelRepo-->>Notebook: error
end
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/en/model_inference/model_management/how_to/upload_models_using_notebook.md(1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
docs/en/model_inference/model_management/how_to/upload_models_using_notebook.md
140-140: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
148-148: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
Actionable comments posted: 1
🧹 Nitpick comments (9)
docs/en/model_inference/model_management/how_to/upload_models_using_notebook.mdx (9)
51-59: Clarify directory placeholder and remote name
`<your-repo-name>` is ambiguous here; this step should cd into the downloaded model folder, and the remote should reference the copied Repository Address.
-# Navigate to the folder where you downloaded the model in the previous step.
-cd <your-repo-name>
+# Navigate to the folder where you downloaded the model in the previous step.
+cd <downloaded-model-directory>
@@
-git remote add origin <repository-url>
+git remote add origin <repository-address>
101-104: Avoid duplicating LFS patterns in .gitattributes
Present the `git lfs track` method as an alternative to the manual `.gitattributes` block to prevent duplicate entries.
-# You can also add or modify manually .gitattributes file, for example:
-# Track files with the specified suffix
-git lfs track "*.h5" "*.bin" "*.pt"
+# Alternative: instead of manually writing `.gitattributes` above, you can ask Git LFS to add patterns for you.
+# (Use either the `.gitattributes` block above or this command; do not run both.)
+git lfs track "*.h5" "*.bin" "*.pt"
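Whichever method is used, plain Git can confirm which paths the LFS filter would apply to and whether any pattern is listed twice. The following is a minimal sketch, not part of the reviewed document: the temporary directory and the file names `model.bin` and `README.md` are assumptions, and the patterns are written by hand in the same form that `git lfs track` records them, so the check runs even without Git LFS installed.

```shell
set -eu

# Throwaway repository; the directory and file names are illustrative only.
workdir="$(mktemp -d)"
cd "$workdir"
git init -q

# The three patterns from the suggestion, in the form `git lfs track` writes.
cat > .gitattributes <<'EOF'
*.h5 filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
EOF

# Ask Git which filter applies to a path (the files need not exist).
bin_attr="$(git check-attr filter -- model.bin)"
readme_attr="$(git check-attr filter -- README.md)"
echo "$bin_attr"
echo "$readme_attr"

# Count repeated lines; 0 means every pattern is listed exactly once.
dup_count="$(sort .gitattributes | uniq -d | wc -l | tr -d ' ')"
echo "duplicate patterns: $dup_count"
```

A non-zero duplicate count would indicate that both the manual block and `git lfs track` were applied to the same file.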
13-15: Tighten wording on instance creation and storage
Minor phrasing improvements for clarity.
-The detailed workbench/notebook creation instructions are not detailed here. Please refer to workbench docs.
-You need to note that sufficient storage space must be created to store the model file for the upload process to complete successfully.
+The detailed Workbench/Notebook creation steps are covered in the Workbench docs.
+Ensure enough storage is allocated to hold the model files; otherwise the upload will fail.
18-23: Clean up sources section and example link
Remove the dangling “such as …” and list sources first; add the example separately.
-Download the required model from any open source community. We recommend downloading from the following three websites, such as [https://hf-mirror.com/deepseek-ai/DeepSeek-R1].
-
-* [https://huggingface.co/](https://huggingface.co/)
-* [https://hf-mirror.com](https://hf-mirror.com)
-* [https://modelscope.cn/home](https://modelscope.cn/home)
+Download the required model from a trusted source. Common sources include:
+
+* [https://huggingface.co/](https://huggingface.co/)
+* [https://hf-mirror.com](https://hf-mirror.com)
+* [https://modelscope.cn/home](https://modelscope.cn/home)
+Example: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B on Hugging Face or hf-mirror.
24-30: Mention authentication when required
Some models require accepting terms or a token; suggest `huggingface-cli login`.
 export HF_ENDPOINT=https://hf-mirror.com
 pip install huggingface_hub
+huggingface-cli login # if the model requires authentication or terms acceptance
 huggingface-cli download --resume-download deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --local-dir DeepSeek-R1-Distill-Qwen-1.5B
32-33: Fix grammar in the offline note
Small copy edit.
-> **Note:** If your environment doesn't have internet access, you can choose find a suitable machine with internet access (such as a desktop or server with a high-speed connection to the cluster), download the model, and then copy it to the Notebook environment.
+> **Note:** If your environment doesn't have internet access, find a suitable machine with internet access (for example, a desktop or a server with a fast connection to the cluster), download the model, and then copy it into the Notebook environment.
118-118: Prefer `-m` (not `-am`) for the initial commit
You already staged files; `-a` is redundant and can confuse beginners.
-git commit -am "Add LLM model files with Git LFS"
+git commit -m "Add LLM model files with Git LFS"
123-125: Use `--force-with-lease` for safer rewrites
Recommend the safer force variant and keep the caution.
-# If you need to force a push, for example after using git lfs migrate --import:
-# git push -u origin main --force
+# If you need to force-push (for example after `git lfs migrate import`), prefer a safer variant:
+# git push -u origin main --force-with-lease
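To illustrate the difference, `--force-with-lease` overwrites the remote branch only when it still matches the local remote-tracking ref, so a history rewrite cannot silently discard commits someone else pushed in the meantime. The sketch below is an assumption-laden stand-in rather than the documented procedure: a local bare repository plays the role of the Model Repository, an amended commit stands in for the `git lfs migrate import` rewrite, and the identity and file names are placeholders.

```shell
set -eu

# Stand-in remote: a local bare repository (hypothetical; the real remote
# would be the Repository Address of the Model Repository).
base="$(mktemp -d)"
git init -q --bare "$base/remote.git"

git init -q "$base/work"
cd "$base/work"
git symbolic-ref HEAD refs/heads/main      # name the first branch "main"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"
git remote add origin "$base/remote.git"
git push -q -u origin main

# Rewrite history locally (a stand-in for `git lfs migrate import`).
echo "v2" > notes.txt
git commit -q -a --amend -m "Initial commit (rewritten)"

# Succeeds only if the remote branch still matches origin/main as last seen here.
git push -q --force-with-lease origin main

local_head="$(git rev-parse HEAD)"
remote_head="$(git --git-dir="$base/remote.git" rev-parse refs/heads/main)"
echo "local:  $local_head"
echo "remote: $remote_head"
```

If another clone had pushed to `main` between the first push and the rewrite, the last push would be rejected instead of overwriting that work, which is the behavior a plain `--force` does not provide.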
5-5: Capitalize “Notebook” in the title
Consistency with section headings.
-# Upload models using notebook
+# Upload models using Notebook
📒 Files selected for processing (1)
docs/en/model_inference/model_management/how_to/upload_models_using_notebook.mdx(1 hunks)
🔇 Additional comments (1)
docs/en/model_inference/model_management/how_to/upload_models_using_notebook.mdx (1)
120-122: Remove TLS bypass; keep certificate verification on
Disabling TLS verification invites MITM. Instruct users to trust the cluster CA and push with verification enabled.
-# Push to the remote repository
-git -c http.sslVerify=false -c lfs.activitytimeout=36000 push -u origin main
+# Push to the remote repository
+# If your Git server uses a private CA, install/trust it locally (do not disable verification).
+# Example:
+# wget -O /tmp/alauda-ca.crt <cluster-ca-url>
+# git config http.sslCAInfo /tmp/alauda-ca.crt
+git -c lfs.activitytimeout=36000 push -u origin main
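The suggested `http.sslCAInfo` setting can be scoped to a single repository and read back to confirm that verification has not been overridden. This is a minimal sketch under stated assumptions: the CA file path is hypothetical and an empty placeholder file is created only so the path exists; a real setup would use the certificate supplied by the platform administrator.

```shell
set -eu

repo="$(mktemp -d)"
confdir="$(mktemp -d)"
cd "$repo"
git init -q

# Hypothetical location of the cluster CA bundle (placeholder file only).
ca_file="$confdir/cluster-ca.crt"
: > "$ca_file"

# Trust the CA for this repository only, leaving verification enabled.
git config --local http.sslCAInfo "$ca_file"

configured_ca="$(git config --local --get http.sslCAInfo)"
# Empty when http.sslVerify is not overridden in this repository.
ssl_verify="$(git config --local --get http.sslVerify || true)"
echo "CA bundle: $configured_ca"
echo "sslVerify override: ${ssl_verify:-<none>}"
```

Using `--local` keeps the setting out of the user's global Git configuration, so other repositories in the same Notebook are unaffected.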
After the model repository is created, you can find the model's "Repository Address" on the "Details" page. This will be used for subsequent git builds. Use when pushing
Fix incomplete sentence and clarify usage of Repository Address
The sentence trails off; clarify it’s the Git remote URL used for pushing.
-After the model repository is created, you can find the model's "Repository Address" on the "Details" page. This will be used for subsequent git builds. Use when pushing
+After the model repository is created, open the model’s “Details” page and copy the “Repository Address.” You will use it as the Git remote URL in the next steps when pushing.
🤖 Prompt for AI Agents
In
docs/en/model_inference/model_management/how_to/upload_models_using_notebook.mdx
around lines 42-43, the sentence "Use when pushing" is incomplete and ambiguous;
update it to state that the model's "Repository Address" shown on the Details
page is the Git remote URL to use as the remote when pushing changes for
subsequent git builds (e.g., "Use this Repository Address as the Git remote URL
when pushing for subsequent git builds.").
Actionable comments posted: 0
🧹 Nitpick comments (4)
docs/en/model_inference/model_management/how_to/upload_models_using_notebook.mdx (4)
13-14: Improve clarity and grammar.
Two issues need attention:
- Line 13: The phrase "detailed workbench/notebook creation instructions are not detailed here" is redundant.
- Line 14: "sufficient storage space must be created" is unclear; storage is allocated or provisioned, not "created."
Apply this diff to improve clarity:
-The detailed workbench/notebook creation instructions are not detailed here. Please refer to workbench docs.
-You need to note that sufficient storage space must be created to store the model file for the upload process to complete successfully.
+For detailed workbench/notebook creation instructions, please refer to the workbench documentation.
+Ensure sufficient storage space is allocated to store the model files for the upload process to complete successfully.
18-18: Fix redundant phrasing.
The phrase "from the following three websites, such as" is redundant. Either introduce the list directly or provide a single example, but not both.
Apply this diff:
-Download the required model from any open source community. We recommend downloading from the following three websites, such as https://hf-mirror.com/deepseek-ai/DeepSeek-R1.
+Download the required model from any open source community. We recommend downloading from the following websites (for example, https://hf-mirror.com/deepseek-ai/DeepSeek-R1):
55-55: Fix grammar in comment.
The comment "Initialization Create a git repository" has awkward grammar.
Apply this diff:
-# Initialization Create a git repository and set the push URL to the model repository created in the previous step.
+# Initialize a git repository and set the push URL to the model repository created in the previous step.
60-61: Improve comment clarity.
The phrase "The following file identifies" is unclear. The `.gitattributes` file is a configuration, not a file that "identifies."
Apply this diff:
-# In the .gitattributes file, specify the file types to tell Git LFS which files to track.
-# The following file identifies common model file formats and can be used directly.
+# Create a .gitattributes file to specify which file types Git LFS should track.
+# The following configuration covers common model file formats and can be used directly.
⛔ Files ignored due to path filters (1)
yarn.lock is excluded by !**/yarn.lock, !**/*.lock
📒 Files selected for processing (2)
docs/en/model_inference/model_management/how_to/upload_models_using_notebook.mdx (1 hunks)
package.json (1 hunks)
🔇 Additional comments (3)
docs/en/model_inference/model_management/how_to/upload_models_using_notebook.mdx (2)
42-43: Fix incomplete sentence about Repository Address usage.
The sentence "Use when pushing" is incomplete and unclear. It should explain that the Repository Address is used as the Git remote URL in the upload steps.
Apply this diff to complete the sentence:
-After the model repository is created, you can find the model's "Repository Address" on the "Details" page. This will be used for subsequent git builds. Use when pushing
+After the model repository is created, you can find the model's "Repository Address" on the "Details" page. This will be used as the Git remote URL when pushing in the next steps.
120-122: Remove TLS-bypass flag for security.
The command instructs users to disable certificate verification with `http.sslVerify=false`, which exposes them to man-in-the-middle attacks. Users should install and trust the cluster CA certificate instead.
Apply this diff to remove the insecure flag:
-# Push to the remote repository
-git -c http.sslVerify=false -c lfs.activitytimeout=36000 push -u origin main
+# Push to the remote repository
+# If your Git server uses a private CA, install it locally and configure Git to trust it:
+# Example:
+# wget -O /tmp/alauda-ca.crt <cluster-ca-url>
+# git config http.sslCAInfo /tmp/alauda-ca.crt
+git -c lfs.activitytimeout=36000 push -u origin main
package.json (1)
3-3: Version bump verified. @alauda/doom@1.13.2 exists (latest) and no security advisories were detected.
b9cfc2a to bc6e4f2
Actionable comments posted: 0
🧹 Nitpick comments (5)
docs/en/model_inference/model_management/how_to/upload_models_using_notebook.mdx (5)
13-14: Improve phrasing for clarity.
The current phrasing is awkward and can be improved for better readability.
Apply this diff:
-The detailed workbench/notebook creation instructions are not detailed here. Please refer to workbench docs.
-You need to note that sufficient storage space must be created to store the model file for the upload process to complete successfully.
+For detailed workbench/notebook creation instructions, please refer to the workbench documentation.
+Ensure that sufficient storage space is allocated to store the model files for a successful upload.
55-55: Fix grammar in comment.
The comment has a grammatical error.
Apply this diff:
-# Initialization Create a git repository and set the push URL to the model repository created in the previous step.
+# Initialize a git repository and set the push URL to the model repository created in the previous step.
60-62: Clarify wording in comment.
The comment should clarify that it's the "content" (not "file") being shown.
Apply this diff:
-# In the .gitattributes file, specify the file types to tell Git LFS which files to track.
-# The following file identifies common model file formats and can be used directly.
+# In the .gitattributes file, specify which file types Git LFS should track.
+# The following content identifies common model file formats and can be used directly.
140-140: Format as a proper heading.
For consistency with the rest of the document, this should be formatted as a heading.
Apply this diff:
-Checking the files that need to be migrated
+#### Checking the files that need to be migrated
146-146: Format as a proper heading.
For consistency with the rest of the document, this should be formatted as a heading.
Apply this diff:
-Migrate existing large files to LFS:
+#### Migrate existing large files to LFS
📒 Files selected for processing (1)
docs/en/model_inference/model_management/how_to/upload_models_using_notebook.mdx(1 hunks)
🔇 Additional comments (2)
docs/en/model_inference/model_management/how_to/upload_models_using_notebook.mdx (2)
42-43: Fix incomplete sentence about Repository Address usage.
The sentence "Use when pushing" is incomplete and unclear. This issue was raised in a previous review but remains unaddressed.
Apply this diff:
-After the model repository is created, you can find the model's "Repository Address" on the "Details" page. This will be used for subsequent git builds. Use when pushing
+After the model repository is created, you can find the model's "Repository Address" on the "Details" page. You will use this address as the Git remote URL when pushing the model files in the next steps.
120-122: Remove the TLS-bypass flag from the push command (Security Issue).
The `-c http.sslVerify=false` flag disables SSL certificate verification, exposing users to man-in-the-middle attacks. This critical security issue was raised in a previous review but remains unaddressed.
Apply this diff:
 # Push to the remote repository
-git -c http.sslVerify=false -c lfs.activitytimeout=36000 push -u origin main
+# If your Git server uses a private CA, install it locally or configure Git to trust it.
+# Do not disable SSL verification. Example:
+# git config http.sslCAInfo /path/to/ca-certificate.crt
+git -c lfs.activitytimeout=36000 push -u origin main
# git push -u origin main --force
## Editing Model Metadata
Describe how to edit metadata, especially how to set up the Framework, as this is the most frequently asked question in projects.
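Now that this guide exists, could we delete the duplicated content in the original "Model Management" doc and add a link pointing to this document instead?
If the content looks fine, I'll make that change. The original doc did not mention Notebook and also offered many push options. Here we provide only one fixed method, to reduce problems.
The content is fine. If possible, please also add how to edit metadata; users run into a lot of usage questions in this area.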
Actionable comments posted: 1
♻️ Duplicate comments (2)
docs/en/model_inference/model_management/how_to/upload_models_using_notebook.mdx (2)
52-53: Fix incomplete sentence and clarify Repository Address usage.
The sentence trails off; complete it by clarifying that the Repository Address is the Git remote URL for subsequent steps.
-After the model repository is created, you can find the model's "Repository Address" on the "Details" page. This will be used for subsequent git builds. Use when pushing
+After the model repository is created, open the model's "Details" page and copy the "Repository Address." You will use it as the Git remote URL in the next steps when pushing.
130-132: CRITICAL: Remove the TLS-bypass flag from the push command.
Instructing users to disable certificate verification (`http.sslVerify=false`) during `git push` exposes them to man-in-the-middle (MITM) attacks. Users should instead trust the cluster's certificate authority. This is a security issue flagged in a previous review and must be fixed before merge.
-# Push to the remote repository
-git -c http.sslVerify=false -c lfs.activitytimeout=36000 push -u origin main
+# Push to the remote repository
+# If your Git server uses a private CA, install it locally and point Git at it.
+# Example: git config http.sslCAInfo /path/to/ca-cert.crt
+# Do not disable certificate verification; it creates security vulnerabilities.
+git -c lfs.activitytimeout=36000 push -u origin main
🧹 Nitpick comments (1)
docs/en/model_inference/model_management/how_to/upload_models_using_notebook.mdx (1)
137-151: Metadata editing section well-added, but fix repetitive sentence structure.
The new metadata editing section addresses the frequently requested feature about setting task type and framework. However, reduce repetition of sentence beginnings with "For" in the bullet points. The guidance is clear and helpful for users.
- Task Type: Determines how inference services will be created based on the model (e.g., text-generation, image-classification).
- For LLM models, select "text-generation".
- For other models downloaded from Huggingface, refer to the task type listed on the model's page.
- For custom models, select the task type based on the model's intended use.
- Framework: The format that the model stores.
- For models Downloaded from Huggingface, select "Transformers".
- For models saved as [MLFlow format](https://mlflow.org/docs/latest/ml/model/), select "MLFlow".
- For models saved as [Triton Inference Server format](https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/user_guide/model_configuration.html), select "Triton".
+ Task Type: Determines how inference services will be created based on the model (e.g., text-generation, image-classification).
+ - LLM models: select "text-generation".
+ - Other models from Huggingface: refer to the task type listed on the model's page.
+ - Custom models: select the task type based on the model's intended use.
+ Framework: The format that the model stores.
+ - Huggingface models: select "Transformers".
+ - [MLFlow format](https://mlflow.org/docs/latest/ml/model/) models: select "MLFlow".
+ - [Triton Inference Server format](https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/user_guide/model_configuration.html) models: select "Triton".
📒 Files selected for processing (2)
docs/en/model_inference/model_management/functions/model_repository.mdx (1 hunks)
docs/en/model_inference/model_management/how_to/upload_models_using_notebook.mdx (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- docs/en/model_inference/model_management/functions/model_repository.mdx
🧰 Additional context used
🪛 LanguageTool
docs/en/model_inference/model_management/how_to/upload_models_using_notebook.mdx
[grammar] ~26-~26: Use a hyphen to join words.
Context: ...ownload the required model from any open source community. We recommend downloadi...
(QB_NEW_EN_HYPHEN)
[style] ~146-~146: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...sk type listed on the model's page. - For custom models, select the task type bas...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~150-~150: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...latest/ml/model/), select "MLFlow". - For models saved as [Triton Inference Serve...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
🔇 Additional comments (2)
docs/en/model_inference/model_management/how_to/upload_models_using_notebook.mdx (2)
1-23: Good introductory framing and clear prerequisites.
The introduction effectively communicates why using Notebook is preferable, and appropriately defers detailed Workbench creation to other docs. The storage space note is helpful for users planning their setup.
153-172: Appendix is well-structured with appropriate warnings.
The migration section appropriately cautions users about history rewriting and the need to coordinate with collaborators. The command examples are clear and the guidance about using `--force` when pushing is appropriate.
## Preparing the Model
Download the required model from any open source community. We recommend downloading from the following three websites, such as https://hf-mirror.com/deepseek-ai/DeepSeek-R1.
Fix hyphenation in "open-source."
Compound adjectives before a noun should be hyphenated for grammatical correctness.
-Download the required model from any open source community. We recommend downloading from the following three websites, such as https://hf-mirror.com/deepseek-ai/DeepSeek-R1.
+Download the required model from any open-source community. We recommend downloading from the following three websites, such as https://hf-mirror.com/deepseek-ai/DeepSeek-R1.
🤖 Prompt for AI Agents
In
docs/en/model_inference/model_management/how_to/upload_models_using_notebook.mdx
around line 26, the phrase "open source community" should be hyphenated as
"open-source community"; update the sentence accordingly (e.g., "Download the
required model from any open-source community. We recommend downloading from the
following three websites, such as
https://hf-mirror.com/deepseek-ai/DeepSeek-R1.") to fix the compound adjective
hyphenation.
Deploying alauda-ai with
| Latest commit: | 7f1a9ba |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://def4a739.alauda-ai.pages.dev |
| Branch Preview URL: | https://upload-models.alauda-ai.pages.dev |