[Bug Report] Kernel fails to boot on an AWS C8i Amazon Linux 2023 machine after following PVM Deployment #161

@kangcp

Summary

After updating the kernel as described in PVM Deployment and rebooting, the machine fails to boot.

Environment

  • CubeSandbox version / commit: v0.2.0

  • Host OS and kernel version: 6.1.166-197.305.amzn2023.x86_64
    Machine details:
    NAME="Amazon Linux"
    VERSION="2023"
    ID="amzn"
    ID_LIKE="fedora"
    VERSION_ID="2023"
    PLATFORM_ID="platform:al2023"
    PRETTY_NAME="Amazon Linux 2023.11.20260413"

  • KVM info (modinfo kvm): /lib/modules/6.1.166-197.305.amzn2023.x86_64/kernel/arch/x86/kvm/kvm.ko

  • Deployment mode: single-node

  • Relevant component:

Steps to Reproduce

  1. Download kernel-6.6.69_cube.pvm.host..rpm and kernel-headers-6.6.69_cube.pvm.host..rpm.

  2. Installing directly with rpm fails. Running rpm -ivh kernel-6.6.69_cube.pvm.host..rpm produces the following error:
    Verifying... ################################# [100%]
    Preparing... ################################# [100%]
    package kernel-1:6.1.166-197.305.amzn2023.x86_64 (which is newer than kernel-6.6.69_cube.pvm.host.005.x_gb85200d80fa2-1.x86_64) is already installed
    [root@s113v112 ~]# rpm -ivh kernel-headers-6.6.69_cube.pvm.host..rpm
    error: Failed dependencies:
    kernel-headers is obsoleted by kernel-headers-6.6.69_cube.pvm.host.005.x_gb85200d80fa2-1.x86_64

  3. Change the installation method (the following commands install successfully):
    dnf install --allowerasing ./kernel-6.6.69_cube.pvm.host..rpm
    dnf install --allowerasing ./kernel-headers-6.6.69_cube.pvm.host..rpm

  4. Set the PVM kernel as the default boot entry (slightly more involved on AWS; the process is as follows):

Step 1: Generate the module dependencies

KVER=$(ls /lib/modules/ | grep cube)
sudo depmod ${KVER}

Step 2: Verify that modules.dep was generated

ls -la /lib/modules/${KVER}/modules.dep

Step 3: Generate the initramfs

sudo dracut --force /boot/initramfs-${KVER}.img ${KVER}

Step 4: Verify that the initramfs was generated

ls -la /boot/initramfs-*cube*

Step 5: Add a GRUB entry and set it as the default

sudo grubby --add-kernel=/boot/vmlinuz-${KVER} \
  --initrd=/boot/initramfs-${KVER}.img \
  --title="PVM Host Kernel 6.6.69" \
  --copy-default
sudo grubby --set-default=/boot/vmlinuz-${KVER}
# Verify
sudo grubby --default-kernel
grubby --info=ALL | grep -E "^kernel|^index|^title"
  5. Configure the required kernel boot parameters:
    bash <(curl -fsSL https://raw.githubusercontent.com/TencentCloud/CubeSandbox/master/deploy/pvm/grub/host_grub_config.sh)

  6. Reboot.
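
For anyone reproducing this, a check along the following lines could be run just before the reboot to rule out missing boot artifacts. It is only a sketch: `check_boot_artifacts` is a hypothetical helper, not part of CubeSandbox, and it assumes the file names used by the depmod, dracut, and grubby commands above. It also guards against `ls /lib/modules/ | grep cube` matching more than one directory, which would leave `KVER` holding several names.

```shell
#!/usr/bin/env bash
# Hypothetical pre-reboot check (not part of CubeSandbox).
# Takes an optional root directory so it can be dry-run against a test tree;
# defaults to / for use on the real host.
check_boot_artifacts() {
  local root="${1:-/}"
  local prefix="${root%/}"   # "/" becomes "", so paths stay absolute
  local matches=()
  local d

  # Collect every module directory whose name contains "cube".
  for d in "${prefix}"/lib/modules/*cube*; do
    [ -d "$d" ] && matches+=("$(basename "$d")")
  done

  if [ "${#matches[@]}" -ne 1 ]; then
    echo "expected exactly one cube kernel under lib/modules, found ${#matches[@]}" >&2
    return 1
  fi

  local kver="${matches[0]}"
  local f missing=0

  # These are the files the steps above generate or reference.
  for f in "lib/modules/${kver}/modules.dep" \
           "boot/vmlinuz-${kver}" \
           "boot/initramfs-${kver}.img"; do
    if [ ! -s "${prefix}/${f}" ]; then
      echo "missing or empty: /${f}" >&2
      missing=1
    fi
  done

  # Print the kernel version only when everything is present.
  [ "$missing" -eq 0 ] && echo "$kver"
}

# Usage on the host:
#   KVER=$(check_boot_artifacts) && echo "boot artifacts present for ${KVER}"
```

A passing check only shows that the files exist. It does not show that the initramfs contains the drivers the instance needs to find its root volume.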

Expected Behavior

After the reboot, the setup takes effect and the machine boots normally.

Actual Behavior

The machine fails to boot.

Additional Context
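
A note on the `rpm -ivh` refusal in step 2: it appears to come from the package epoch rather than the version number. The installed kernel is reported as `kernel-1:6.1.166-...`, where the leading `1:` is the epoch, while the PVM package carries no epoch and so counts as epoch 0. RPM compares epochs before versions, so 6.1.166 is treated as newer than 6.6.69. The sketch below illustrates only that first comparison; `rpm_epoch` is a hypothetical helper, and the two strings are copied from the error output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the epoch of an [epoch:]version-release string.
# A missing epoch is treated as 0, which is how RPM handles it.
rpm_epoch() {
  case "$1" in
    *:*) echo "${1%%:*}" ;;  # everything before the first colon
    *)   echo 0 ;;
  esac
}

installed="1:6.1.166-197.305.amzn2023"
candidate="6.6.69_cube.pvm.host.005.x_gb85200d80fa2-1"

# Epoch is compared first; version and release only matter when epochs tie.
if [ "$(rpm_epoch "$installed")" -gt "$(rpm_epoch "$candidate")" ]; then
  echo "installed kernel wins on epoch, regardless of version"
fi
```

This would explain why `dnf install --allowerasing` was needed in step 3, but it is probably unrelated to the boot failure itself.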
