
Adds a Jupyter notebook tutorial#213

Merged
diptorupd merged 5 commits into ROCm:amd-integration from diptorupd:demo/jupyter-notebook
Apr 14, 2026
Merged

Adds a Jupyter notebook tutorial#213
diptorupd merged 5 commits intoROCm:amd-integrationfrom
diptorupd:demo/jupyter-notebook

Conversation

@diptorupd
Collaborator

A Jupyter notebook demo showing how to use amd-flashinfer.

Copilot AI review requested due to automatic review settings April 3, 2026 20:53

Copilot AI left a comment


Pull request overview

Adds a ROCm-focused Jupyter notebook tutorial to demonstrate using amd-flashinfer (module flashinfer) for runtime validation (hip_utils), AITER-backed prefill attention, and logits_processor pipelines, plus a helper script to launch JupyterLab from the repo.

Changes:

  • Document the new tutorial notebook and Jupyter launcher in the README examples list.
  • Add examples/run_jupyter_server.sh to start JupyterLab from the repo root and auto-install jupyterlab if missing.
  • Add examples/amd_flashinfer_rocm_tutorial.ipynb with end-to-end runnable tutorial cells (runtime checks, prefill, logits processing).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File Description
README.md Adds new “Available examples” entries for the tutorial notebook and Jupyter launcher.
examples/run_jupyter_server.sh Introduces a convenience script to launch JupyterLab from the repository root.
examples/amd_flashinfer_rocm_tutorial.ipynb New tutorial notebook covering ROCm environment verification, AITER-backed prefill, and LogitsPipe usage.


Comment on lines +27 to +32
PORT="${JUPYTER_PORT:-8888}"
IP="${JUPYTER_IP:-0.0.0.0}"

echo "Starting JupyterLab from: $ROOT"
echo " URL: http://127.0.0.1:${PORT}/lab (use SSH -L if remote)"
echo " Stop: Ctrl+C"

Copilot AI Apr 3, 2026


The script defaults to --ip=0.0.0.0, which binds JupyterLab on all network interfaces. This is risky on shared machines or when --network=host is used, because it can expose the server beyond localhost. Prefer defaulting to 127.0.0.1 and requiring users to explicitly set JUPYTER_IP=0.0.0.0 (or add a prominent warning) when they intend remote access with SSH port-forwarding.
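The safer default the review asks for could look like the following sketch. The variable names mirror the snippet above; the warning message and the commented-out launch line are illustrative, not taken from the actual script.

```shell
# Sketch of a loopback-first default for run_jupyter_server.sh (hypothetical).
PORT="${JUPYTER_PORT:-8888}"
IP="${JUPYTER_IP:-127.0.0.1}"   # default to loopback, not 0.0.0.0

# Only bind on all interfaces when the user explicitly opts in.
if [ "$IP" = "0.0.0.0" ]; then
  echo "WARNING: binding on all interfaces; the server may be reachable from the network." >&2
fi

echo "Starting JupyterLab on http://${IP}:${PORT}/lab"
echo "Remote access: ssh -L ${PORT}:127.0.0.1:${PORT} user@host, then open http://127.0.0.1:${PORT}/lab"
# exec jupyter lab --ip="$IP" --port="$PORT"   # actual launch omitted in this sketch
```

With this default, a user on a shared machine gets a localhost-only server unless they set JUPYTER_IP=0.0.0.0 themselves.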

@demandal25 demandal25 self-requested a review April 13, 2026 14:36
Copilot AI review requested due to automatic review settings April 13, 2026 19:02

Copilot AI left a comment


Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.



Comment on lines +313 to +328
"def reconstruct_seq_from_paged_nhd(kv_tensor, kv_ip, kv_lpl, seq_idx, kv_slot):\n",
" chunks = []\n",
" start = int(kv_ip[seq_idx].item())\n",
" end = int(kv_ip[seq_idx + 1].item())\n",
" last_tokens = int(kv_lpl[seq_idx].item())\n",
" for p in range(start, end - 1):\n",
" chunks.append(kv_tensor[p, kv_slot, :, :, :].reshape(-1, num_kv_heads, head_dim))\n",
" p_last = end - 1\n",
" chunks.append(\n",
" kv_tensor[p_last, kv_slot, :last_tokens, :, :].reshape(-1, num_kv_heads, head_dim)\n",
" )\n",
" return torch.cat(chunks, dim=0)\n",
"\n",
"\n",
"k0 = reconstruct_seq_from_paged_nhd(kv_data, kv_indptr, kv_last_page_len, 0, 0)\n",
"v0 = reconstruct_seq_from_paged_nhd(kv_data, kv_indptr, kv_last_page_len, 0, 1)\n",

Copilot AI Apr 13, 2026


reconstruct_seq_from_paged_nhd reconstructs pages using the raw page index p from kv_indptr, but it ignores kv_indices (the indirection table). This will reconstruct the wrong sequence whenever kv_indices is not the identity mapping, which is the common case in real paged KV caches. Consider passing kv_indices into this helper and using it to map per-sequence page slots to global page IDs.

Suggested change
"def reconstruct_seq_from_paged_nhd(kv_tensor, kv_ip, kv_lpl, seq_idx, kv_slot):\n",
" chunks = []\n",
" start = int(kv_ip[seq_idx].item())\n",
" end = int(kv_ip[seq_idx + 1].item())\n",
" last_tokens = int(kv_lpl[seq_idx].item())\n",
" for p in range(start, end - 1):\n",
" chunks.append(kv_tensor[p, kv_slot, :, :, :].reshape(-1, num_kv_heads, head_dim))\n",
" p_last = end - 1\n",
" chunks.append(\n",
" kv_tensor[p_last, kv_slot, :last_tokens, :, :].reshape(-1, num_kv_heads, head_dim)\n",
" )\n",
" return torch.cat(chunks, dim=0)\n",
"\n",
"\n",
"k0 = reconstruct_seq_from_paged_nhd(kv_data, kv_indptr, kv_last_page_len, 0, 0)\n",
"v0 = reconstruct_seq_from_paged_nhd(kv_data, kv_indptr, kv_last_page_len, 0, 1)\n",
"def reconstruct_seq_from_paged_nhd(kv_tensor, kv_ip, kv_indices, kv_lpl, seq_idx, kv_slot):\n",
" chunks = []\n",
" start = int(kv_ip[seq_idx].item())\n",
" end = int(kv_ip[seq_idx + 1].item())\n",
" last_tokens = int(kv_lpl[seq_idx].item())\n",
" for p in range(start, end - 1):\n",
" page_id = int(kv_indices[p].item())\n",
" chunks.append(kv_tensor[page_id, kv_slot, :, :, :].reshape(-1, num_kv_heads, head_dim))\n",
" p_last = end - 1\n",
" page_id_last = int(kv_indices[p_last].item())\n",
" chunks.append(\n",
" kv_tensor[page_id_last, kv_slot, :last_tokens, :, :].reshape(-1, num_kv_heads, head_dim)\n",
" )\n",
" return torch.cat(chunks, dim=0)\n",
"\n",
"\n",
"k0 = reconstruct_seq_from_paged_nhd(kv_data, kv_indptr, kv_indices, kv_last_page_len, 0, 0)\n",
"v0 = reconstruct_seq_from_paged_nhd(kv_data, kv_indptr, kv_indices, kv_last_page_len, 0, 1)\n",

@yaoliu13 yaoliu13 self-requested a review April 13, 2026 21:55
Collaborator

@yaoliu13 yaoliu13 left a comment


LGTM. Thank you for adding the output to the notebook.

Collaborator

@demandal25 demandal25 left a comment


LGTM

diptorupd and others added 5 commits April 14, 2026 13:30
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Diptorup Deb <diptorup@cs.unc.edu>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Diptorup Deb <diptorup@cs.unc.edu>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Diptorup Deb <diptorup@cs.unc.edu>
Copilot AI review requested due to automatic review settings April 14, 2026 18:30
@diptorupd diptorupd force-pushed the demo/jupyter-notebook branch from 031670f to d6ee47f Compare April 14, 2026 18:31
@diptorupd diptorupd merged commit 8da7342 into ROCm:amd-integration Apr 14, 2026
3 of 4 checks passed
@diptorupd diptorupd deleted the demo/jupyter-notebook branch April 14, 2026 18:33

Copilot AI left a comment


Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.



Comment on lines +75 to +107
"execution_count": 1,
"id": "0ebe68e6",
"metadata": {
"execution": {
"iopub.execute_input": "2026-04-13T19:55:15.549091Z",
"iopub.status.busy": "2026-04-13T19:55:15.548883Z",
"iopub.status.idle": "2026-04-13T19:55:19.421222Z",
"shell.execute_reply": "2026-04-13T19:55:19.420842Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[aiter] import [module_aiter_enum] under /home/AMD/diptodeb/micromamba/envs/flashinfer-rocm-devel/lib/python3.12/site-packages/aiter/jit/module_aiter_enum.so\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"flashinfer: 0.5.3+amd.1.dev9\n",
"torch: 2.9.1+rocm7.2.0.git7e1940d4\n",
"PyTorch HIP / ROCm build: 7.2.26015-fc0010cf6a\n",
"Detected system ROCm version: 7.2.0\n",
"Architectures with AMD FlashInfer ports: gfx942, gfx950\n",
"GPU count (torch): 1\n",
"Device indices FlashInfer treats as supported Instinct (rocminfo): (0,)\n",
"Using device: cuda:0\n"
]
}
],

Copilot AI Apr 14, 2026


This notebook is committed with cell outputs and execution metadata (timestamps, stderr logs, absolute paths, etc.). That makes diffs noisy and can unintentionally leak machine-specific information. Please clear all outputs and reset execution counts/metadata before committing (keep only the source/markdown).

Suggested change
"execution_count": 1,
"id": "0ebe68e6",
"metadata": {
"execution": {
"iopub.execute_input": "2026-04-13T19:55:15.549091Z",
"iopub.status.busy": "2026-04-13T19:55:15.548883Z",
"iopub.status.idle": "2026-04-13T19:55:19.421222Z",
"shell.execute_reply": "2026-04-13T19:55:19.420842Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[aiter] import [module_aiter_enum] under /home/AMD/diptodeb/micromamba/envs/flashinfer-rocm-devel/lib/python3.12/site-packages/aiter/jit/module_aiter_enum.so\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"flashinfer: 0.5.3+amd.1.dev9\n",
"torch: 2.9.1+rocm7.2.0.git7e1940d4\n",
"PyTorch HIP / ROCm build: 7.2.26015-fc0010cf6a\n",
"Detected system ROCm version: 7.2.0\n",
"Architectures with AMD FlashInfer ports: gfx942, gfx950\n",
"GPU count (torch): 1\n",
"Device indices FlashInfer treats as supported Instinct (rocminfo): (0,)\n",
"Using device: cuda:0\n"
]
}
],
"execution_count": null,
"id": "0ebe68e6",
"metadata": {},
"outputs": [],


4 participants